What is Toluu?
Toluu is a free service for sharing the feeds you read and discovering new ones.
Get Invite

High Scalability - Building bigger, faster, more reliable websites.


Paper: Sharding with Oracle DatabaseToday

The upshot of the paper is Oracle rules and MySQL sucks for sharding. Which is technically probable, if you don't throw in minor points like cost and ease of use. The points where they think Oracle wins: online schema changes, more robust replication, higher availability, better corruption handling, better use of large RAM and multiple cores, better and better tested partitioning features, better monitoring, and better gas mileage.

Sun Acquires Q-layer in Cloud Computing PlayYesterday

Datacenterknowledge.com: In an effort to boost its refocused cloud computing initiative, Sun Microsystems (JAVA) has acquired Q-layer, a Belgian provider that automates the deployment of both public and private clouds. Sun says Q-layer’s technology will help users instantly provision servers, storage, bandwidth and applications.

Do you have experience with Q-layers technology like its Virtual Private DataCenter and NephOS?

An Unorthodox Approach to Database Design : The Coming of the ShardJanuary 7

Update 2: Mr. Moore gets to punt on sharding by Alan Rimm-Kaufman of 37signals. Insightful article on design tradeoffs and the evils of premature optimization. With more memory, more CPU, and new tech like SSD, problems can be avoided before more exotic architectures like sharding are needed. Add features not infrastructure.
Update: Dan Pritchett shares some excellent Sharding Lessons: Size Your Shards, Use Math on Shard Counts, Carefully Consider the Spread, Plan for Exceeding Your Shards

Once upon a time we scaled databases by buying ever bigger, faster, and more expensive machines. While this arrangement is great for big iron profit margins, it doesn't work so well for the bank accounts of our heroic system builders who need to scale well past what they can afford to spend on giant database servers. In a extraordinary two article series, Dathan Pattishall, explains his motivation for a revolutionary new database architecture--sharding--that he began thinking about even before he worked at Friendster, and fully implemented at Flickr. Flickr now handles more than 1 billion transactions per day, responding in less then a few seconds and can scale linearly at a low cost.

What is sharding and how has it come


Messaging is not just for investment banksJanuary 6

It seems that HTTP calls have become a default way to think about distributed systems. HTTP and Web services definitely have a lot to offer, but they are not the only way to do things and there are definitely cases where web is not the right choice. Unfortunately, lots of people just stick with web services and hack on, trying to fit a square peg in a round hole. In cases such as these, a different distribution paradigm can save us quite a lot of time and effort both in development and later in maintenance. One of those different paradigms is messaging.

Lessons Learned at 208K: Towards Debugging Millions of CoresJanuary 6

How do we debug and profile a cloud full of processors and threads? It's a problem more will be seeing as we code big scary programs that run on even bigger scarier clouds. Logging gets you far, but sometimes finding the root cause of problem requires delving deep into a program's execution. I don't know about you, but setting up 200,000+ gdb instances doesn't sound all that appealing. Tools like STAT (Stack Trace Analysis Tool) are being developed to help with this huge task. STAT "gathers and merges stack traces from a parallel application’s processes." So STAT isn't a low level debugger, but it will help you find the needle in a million haystacks.

Abstract:


Petascale systems will present several new challenges to performance and correctness tools. Such machines may contain millions of cores, requiring that tools use scalable data structures and analysis algorithms to collect and to process application data. In addition, at such scales, each tool itself will become a large parallel application – already, debugging the full BlueGene/L (BG/L) installation at the Lawrence Livermore National Laboratory requires employing 1664 tool daemons. To reach such sizes and beyond, tools must use a scalable communication infrastructure and manage their own tool processes efficiently. Some system resources, such as the file system, may also become tool bottlenecks.

In this paper, we present challenges to petascale tool development, using