Scalability Day 1 - NoSQL - Learning From Others

With rapidly growing performance and availability requirements,
contemporary web enterprises are finding it difficult to accommodate a
huge user base and everything that comes with it. One possible solution
to the issues associated with RDBMSs is to reject the relational data
model in favor of the emerging non-relational paradigm, generally
referred to by the industry as the NoSQL model. NoSQL is the industry's
response to the limitations inherent in the relational DNA.

The largest players took on the endeavor of building their own solutions
to overcome data management challenges. Often these solutions evolved
into standalone products, which eventually became industry standards
(BigTable, Dynamo, FlockDB, Cassandra, etc.). Not everyone was prepared
to deal with explosive growth. A great example of this is Twitter.
Granted, Twitter's original version was intended as an internal tool for
Odeo, and it was not expected to grow as big as it did. To this day,
Twitter continues to struggle periodically with the wealth of traffic
and data that crosses its cables.

As the Resultly idea began to take shape, it became apparent to us that,
as time progressed, we would need to deal with a large volume of data,
collection, and users. Twitter and many other startups we looked at
handled this poorly, in our opinion. We understood that many of these
problems lie in the design of the original Twitter structure and its
reliance on a traditional relational database. Relational databases are
inherently difficult to scale as a service and its data grow.

As we mentioned in our previous blog posts, we expect Resultly to grow
rapidly. There is no doubt that there will be many pitfalls associated
with our growth, but we can avoid some traps by learning from others'
mistakes. The nature of our application imposes severe data throughput
requirements. Service availability will be crucial to our success,
especially during the initial user acquisition phase. We learned
Twitter's lesson well, and we wanted to make the right choices from the
start. It became obvious that NoSQL is the appropriate data management
framework for our company.

Among the several contenders we reviewed, Apache Cassandra was chosen.
It is a free, open source data management system with many of the
characteristics that Resultly expects to rely on. Resultly is written in
C# on .NET, and the availability of a .NET client for Cassandra
(Aquiles) was the deciding factor in this choice.

Similar to many other NoSQL solutions, Cassandra is built from the
ground up to handle massive amounts of data efficiently. Cassandra is
redundant, ensuring a high level of service availability with no single
point of failure. Data distribution and replication among servers and
clusters are handled automatically by Cassandra itself rather than by
application code. It is highly scalable: adding new servers is a matter
of a few command lines, without any modification to the application
code.
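
To make the idea of automatic data distribution more concrete, here is a
minimal Python sketch of a consistent-hash ring with replication. It is
a toy model under stated assumptions, not Cassandra's implementation and
not any client API: the ToyRing class, the node names, the key
"user:42", the choice of SHA-256, and the replication factor of 3 are
all hypothetical. It only illustrates why a caller can keep asking
"where does this key live?" in the same way after a server joins.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a position on the ring (hash chosen arbitrarily for this toy)."""
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)


class ToyRing:
    """Hypothetical stand-in for a cluster; not a Cassandra API."""

    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = replication_factor
        # Each node owns one token; the ring is kept sorted by token.
        self._ring = sorted((token(node), node) for node in nodes)

    def add_node(self, node):
        """A new server joins the ring; callers do not change."""
        bisect.insort(self._ring, (token(node), node))

    def replicas_for(self, key):
        """Walk clockwise from the key's token and collect the next nodes."""
        tokens = [t for t, _ in self._ring]
        start = bisect.bisect_right(tokens, token(key))
        count = min(self.replication_factor, len(self._ring))
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]


ring = ToyRing(["node-a", "node-b", "node-c", "node-d"])
before = ring.replicas_for("user:42")

ring.add_node("node-e")  # scale out
after = ring.replicas_for("user:42")  # same call as before

print("replicas before:", before)
print("replicas after: ", after)
```

In this sketch, adding "node-e" can change which servers hold a key, but
the only node that can newly appear in the replica list is the one that
just joined, and the lookup call itself stays the same.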

Flexibility of NoSQL systems is, by far, the most important feature to
us. The NoSQL approach does not impose a rigid predefined data
structure, and in agile applications that structure may and should
change dynamically. Changes to the data schema (known as refactoring)
are expensive and best avoided in the relational model. However, this
becomes a non-issue in NoSQL systems, since there is no schema, at least
in the relational sense.
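
As a rough illustration of what "no schema in the relational sense"
means, the Python sketch below models a column family as a mapping from
row key to a free-form set of columns. Everything in it is an assumption
made for illustration: the products mapping, the insert helper, and the
column names are hypothetical and do not come from Resultly's code or
from any Cassandra client.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for a column family:
# row key -> {column name: value}. No column list is declared up front.
products = defaultdict(dict)


def insert(row_key, **columns):
    """Add or overwrite columns on a single row; other rows are untouched."""
    products[row_key].update(columns)


insert("sku-1", name="Lamp", price=19.99)
insert("sku-2", name="Desk", price=120.0, color="oak")  # extra column on one row only

# Later the application starts tracking ratings. The new column simply
# appears on the rows that have one; there is no table-wide migration.
insert("sku-1", rating=4.5)

for key, row in sorted(products.items()):
    print(key, sorted(row))
```

In a relational table, the color and rating columns would each require a
schema change that applies to every row; here each row carries only the
columns it actually has.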

In place of an epilogue.
Resultly's personal experience with Cassandra was an adventure and an
intense brain workout. While learning about Cassandra and the realm of
NoSQL in general, we encountered quite a few warnings on the interwebs
concerning the difficulties one faces when switching from SQL-based
thinking about data organization to the SQL-free approach. We had to
learn from scratch, and the learning curve, we attest, is definitely
steep. Despite a few shortcomings (paging, sorting, etc.), however, the
benefits NoSQL offers are worth the challenge.

At the end of the day, we realized that Cassandra's cold gaze is pointed
not at the software developers trying to get acquainted with her, but
rather at the emerging chaos in improperly organized information.

Article by Resultly
