<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-1936864837167875331</id><updated>2011-11-27T15:17:28.811-08:00</updated><category term='virtualization'/><category term='couchdb'/><category term='cassandra'/><category term='ssd'/><category term='rackspace'/><category term='riak'/><category term='RAISE'/><category term='programming'/><category term='hstack'/><category term='storage'/><category term='adobe'/><category term='bbc'/><category term='mongodb'/><category term='incident management'/><category term='dynamo'/><category term='event management'/><category term='cpus'/><category term='amazon'/><category term='neo4j'/><category term='anandtech'/><category term='motherboard'/><category term='the guardian'/><category term='nosql'/><category term='system maintenance'/><category term='hbase'/><category term='sandforce'/><title type='text'>on tech</title><subtitle type='html'>Random posts on SaaS, Software, and Server Hardware</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://farmhead.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://farmhead.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Farm</name><uri>http://www.blogger.com/profile/12178177852777050635</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' 
uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>15</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-1936864837167875331.post-2080534987899245630</id><published>2010-05-05T09:51:00.000-07:00</published><updated>2010-05-06T11:48:11.239-07:00</updated><title type='text'>Planning for SaaS Infrastructure Failure – part 1</title><content type='html'>There's nothing quite like a good Single Point of Failure (SPOF) during a holiday dinner. Most folks I talk to immediately think of network redundancy when I bring up this topic and tend to look at me strangely when I talk about other parts of their infrastructure. I thought I'd throw a short post together on how to plan for failure in your SaaS infrastructure but, as usual, it's turned into a much longer post than I intended and I'm going to break this post into three parts. Here we go with part 1.&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;&lt;/b&gt;&lt;br /&gt;&lt;b&gt;Start with a meteor strike&lt;/b&gt;&lt;br /&gt;Most people laugh when I say this but you should literally start your failure analysis with "what happens if a meteor strike takes out the building or power lines to our data center". In other words, start with a full disaster recovery situation and work your way all the way down the stack. If you think this is overkill, remember that two highly redundant data centers were both taken out by previously unknown single points of failure.&lt;br /&gt;&lt;br /&gt;Before jumping into these examples, I want to point out that Rackspace and 365 Main are two very well respected companies so these weren’t fly-by-night operations that were taken out. 
Both providers also did an excellent job of publishing what happened on their websites after the event (see links below).&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Rackspace Examples&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Rackspace suffered &lt;a href="http://www.rackspace.com/blog/?p=334"&gt;two outages in June and July of 2009 &lt;/a&gt;at their DFW facility. In the June outage, a power interruption resulted in a switch to generator power. So far so good as the backup systems kicked in. Unfortunately, the generators failed.&lt;br /&gt;&lt;br /&gt;Battery power lasts only around 10-15 minutes in most of the data centers that I've worked in, depending on the power draw (and the state of your UPS batteries), so that's all the time you have to get the generators running. Not much room for error there.&lt;br /&gt;&lt;br /&gt;In the &lt;a href="http://www.rackspace.com/blog/?p=334"&gt;July 2009 outage&lt;/a&gt;, Rackspace again switched to generator power due to a power interruption. Unfortunately, a problem with a "bus duct" prevented proper operation of a UPS and some customers lost power to their servers for about 20 minutes. They also suffered a loss of network connectivity due to the power disruption.&lt;br /&gt;&lt;br /&gt;In 2007, there was also the &lt;a href="http://www.datacenterknowledge.com/archives/2007/11/13/truck-crash-knocks-rackspace-offline/"&gt;infamous truck crash&lt;/a&gt; where a traffic accident damaged a nearby utility transformer, knocking out power to Rackspace's Dallas facility. The company switched over to generator power but two chillers failed to start back up again. Servers had to be shut down to prevent damage from overheating.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;365 Main Example&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;365 Main is a major data center smack in the middle of San Francisco. 
This is probably starting to sound familiar -- a power interruption caused a fail-over to generators and &lt;a href="http://www.365main.com/status_update.html"&gt;three out of ten of the generators failed to start&lt;/a&gt;. This outage took down Yelp, Craigslist, Technorati, LiveJournal, TypePad, and many others. In this case, the failure was due to a bad setting in the Detroit Diesel Electronic Controller (DDEC) for the generators. The setting "was not allowing the component to correctly reset its memory.&amp;nbsp; Erroneous data left in the DDEC’s memory subsequently caused misfiring or engine start failures on the next diesel engine call to start."&lt;br /&gt;&lt;br /&gt;That's right. The backup mechanism for power failure is a diesel generator and this diesel generator has a dependency on an electronic component, presumably with software running on it. I'm fairly ignorant of these diesel generators so let's assume that the DDEC has a redundant controller. Unfortunately, it would appear that the controller has a SPOF in a software setting -- or possibly in the person who configured the DDEC setting.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;I need a backup to my backup, which needs a backup...&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;There's a pattern here. In all four of these incidents, a primary system failed and there was a problem with the backup system or a system that the backup system relied on. More importantly, the systems that failed were regularly tested. Diesel generators, UPS units, chillers, and CRAC (Computer Room Air Conditioning) units are rigorously maintained and tested at every data center that I've worked in and yet these failures still happened.&lt;br /&gt;&lt;br /&gt;Clearly, nodding your head when the colo salesperson tells you that you’re getting into a tier-4 data center isn't enough due diligence on your part. 
You need to understand a whole lot more about your data center infrastructure, design your physical site properly, and prepare for failure. The facility may be tier-4 but you can very easily wire all your critical systems to the same UPS and create a SPOF. You may even wire-up all of your racks with fully independent paths but then deploy your software across servers that have the same PDU.&lt;br /&gt;&lt;br /&gt;Now, a lot of you are probably thinking that you're just out of luck if the building has a power failure and you run into a double or triple failure but I don't believe that's the case unless the building has completely lost power or cooling. In all of the failure cases above, note that only a percentage of the data center was down at any one time. You can avoid being down if you plan for failure across your entire SaaS infrastructure (facility, hardware, and software).&lt;br /&gt;&lt;br /&gt;For example, if you assume that a PDU will fail (and include it as part of your failure analysis plan) you will verify that each power drop to your rack is from a separate UPS and that your service will stay up if a UPS battery bank explodes (yes, I’ve had this happen). You also won't assume that the receptacles to each rack are labeled correctly. You'll do a circuit breaker test. 
You won't even believe that the circuit breaker is rated appropriately and you'll do a power burn-in test before putting the rack into service to make sure you can hit 80% of the circuit without tripping a circuit breaker.&lt;br /&gt;&lt;br /&gt;Why do I torture colo vendors with these tests and why do we monitor everything including temperature/humidity on every rack and temperature of every hard disk in the data center?&lt;br /&gt;&lt;br /&gt;Because I've answered a phone call when a circuit breaker failed early, a CRAC unit had an airflow problem, rack power was partially lost after flipping the wrong circuit breaker because a circuit was labeled incorrectly, the batteries in a PDU exploded during a building utility breaker replacement, fuses blew when pushed into a not-so-forgiving rack PDU, redundant server power supplies started smoking after failing to failover, and an electrician accidentally turned off both circuit breakers to a rack on the launch of a major product (for the record, these circuits were labeled correctly, it was just a mistake on his part during maintenance).&lt;br /&gt;&lt;br /&gt;This doesn't even include the environmental problems like CRAC unit failures or network problems like an ISP black-holing all of your traffic between data centers (happened multiple times now). The truth is, almost every data center experiences these small events a few times a year but most SaaS application owners are either too small to be impacted by these problems or they aren't monitoring well enough to even notice a small event.&lt;br /&gt;&lt;br /&gt;Once you start getting into over 20 racks of gear or a few thousand hard disks, you’re going to start seeing regular hardware failures and you need to be prepared.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Now the good news&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The good news is that you can protect yourself from many of these problems. 
In part two of this post, I’ll talk about categories of failures (electrical, environmental, network, hardware, and software), the components of each category, and provide a mitigation step for each component. In part 3 of the post, I’ll see if I can post a checklist for reference.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1936864837167875331-2080534987899245630?l=farmhead.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://farmhead.blogspot.com/feeds/2080534987899245630/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://farmhead.blogspot.com/2010/05/planning-for-saas-infrastructure.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/2080534987899245630'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/2080534987899245630'/><link rel='alternate' type='text/html' href='http://farmhead.blogspot.com/2010/05/planning-for-saas-infrastructure.html' title='Planning for SaaS Infrastructure Failure – part 1'/><author><name>Farm</name><uri>http://www.blogger.com/profile/12178177852777050635</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1936864837167875331.post-6473149058916095196</id><published>2010-04-30T15:27:00.001-07:00</published><updated>2010-04-30T15:32:52.719-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='incident management'/><category scheme='http://www.blogger.com/atom/ns#' term='event management'/><category scheme='http://www.blogger.com/atom/ns#' term='system maintenance'/><title type='text'>Tips for 
Handling Events, Incidents, Outages, and Maintenance</title><content type='html'>I get a lot of questions from new service teams about what they should do to prevent downtime but very few people ask for advice on how to handle an incident. This is a bit like asking a boxer for the best way to avoid getting in the ring. It’s not a question of “if” you’re going to be in the ring but “when”. There’s an old saying – the more you bleed in the gym, the less you bleed in the ring and that definitely applies to incident management as well.&lt;br /&gt;&lt;br /&gt;Having sat in on more war rooms than I’d like to remember, I thought it might be handy to write down some of the things that my team has found useful over the years. I think every service organization should have a standard approach towards three specific activities:&lt;br /&gt;&lt;br /&gt;1.&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;a href="http://farmhead.blogspot.com/2010/04/tips-for-handling-service-incidents.html"&gt;Tips for Handling Service Incidents&lt;/a&gt; (just one service)&lt;br /&gt;2.&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;a href="http://farmhead.blogspot.com/2010/04/tips-for-handling-service-outages.html"&gt;Tips for Handling Service Outages&lt;/a&gt; (multiple services affected)&lt;br /&gt;3.&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;a href="http://farmhead.blogspot.com/2010/04/tips-and-tricks-for-system-maintenance.html"&gt;Tips for Handling System Maintenance&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I hope these posts help you with your handling of incidents, outages, and maintenance. Success here is mostly about being prepared, being calm, good communication, and practice, practice, practice. 
If you think your service is bullet-proof and you won’t need the practice – you’re wrong :-)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1936864837167875331-6473149058916095196?l=farmhead.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://farmhead.blogspot.com/feeds/6473149058916095196/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://farmhead.blogspot.com/2010/04/tips-for-handling-events-incidents.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/6473149058916095196'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/6473149058916095196'/><link rel='alternate' type='text/html' href='http://farmhead.blogspot.com/2010/04/tips-for-handling-events-incidents.html' title='Tips for Handling Events, Incidents, Outages, and Maintenance'/><author><name>Farm</name><uri>http://www.blogger.com/profile/12178177852777050635</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1936864837167875331.post-4740036477629880062</id><published>2010-04-30T15:26:00.000-07:00</published><updated>2010-05-06T11:48:55.338-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='incident management'/><title type='text'>Tips for Handling Service Incidents</title><content type='html'>This is the first post related to “Tips for Handling Events, Incidents, Outages, and Maintenance”.&lt;br /&gt;&lt;br /&gt;Incidents with a single service are probably the most common type of incident and usually get resolved fairly quickly. 
Examples here are a blown power supply that results in server failover (or not), bad GBICs that result in dropped packets, daemons that crashed, etc. &lt;br /&gt;&lt;br /&gt;The incident usually starts with an email, a text message, a phone call from the NOC, or one of your clients popping their head into your office with their hair on fire. You need to get your head straight and quickly run through a simple process:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size: small;"&gt;&lt;b&gt;&lt;a name='more'&gt;&lt;/a&gt;Get your head straight&lt;/b&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;First, stay calm. The worst thing you could do is cause a major outage, destroy some data, or make the existing problem worse in a panic. Simple problems can easily become large complicated problems after a few bad decisions made in haste. Take a breath before continuing. This is especially important with a page at 3AM or if a panicky client is in your office. Tell the client you’ll handle the problem and run through your normal procedure. &lt;br /&gt;&lt;br /&gt;Don’t be a hero. Get someone else to run the incident if your judgment is impaired due to lack of sleep, alcohol, or medication.&lt;br /&gt;&lt;br /&gt;Remember the prime directive – your job is to restore service as quickly as possible. You are not there to debug interesting problems with your service.&lt;br /&gt;&amp;nbsp; &lt;br /&gt;&lt;span style="font-size: small;"&gt;&lt;b&gt;Solve the problem immediately if it’s a simple problem and you can do it in under a minute&lt;/b&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;It can take up to five minutes to get into a room, get the right people on the line, etc. Just fix it and send out an email after the fact if you can fix it in less than a minute. 
&lt;br /&gt;&lt;br /&gt;For example, let’s say you get an alert from your performance monitoring system that some of your connections are timing out followed by an alert that one of your webservers is running out of memory (but all other webservers are looking fine). Let’s say a large job on the webserver is creating an intermittent high load but it’s not enough to yank the machine out of the load balancer. By all means, yank this webserver out of the load balancer pool so new connections aren’t forwarded to that machine to restore service quickly. &lt;br /&gt;&lt;br /&gt;In doing so, you’ve resolved an intermittent problem with the service and you can then send out an advisory alert before debugging what went wrong or coming up with a better solution to the load balancer monitor (or you can go back to sleep). &lt;br /&gt;&lt;br /&gt;The danger here is letting this one minute investigation turn into five or ten minutes. If you can’t fix it in a minute, get busy with the normal process.&amp;nbsp; &lt;br /&gt;&lt;br /&gt;&lt;span style="font-size: small;"&gt;&lt;b&gt;Do the normal process&lt;/b&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Determine the severity, communicate an alert, and get backup. &lt;br /&gt;&lt;br /&gt;I’m always stunned when I see someone take a call at 3AM and jump right in to fixing the problem without assessing whether it can wait until morning. Determining the severity&amp;nbsp; is a critical step in ensuring that you don’t turn a problem in the middle of the night into a much larger problem. If it can wait until business hours when you’re awake and more people are around for support, by all means wait until the morning. &lt;br /&gt;&lt;br /&gt;You also want backup if it’s an extended event (more than five minutes). The extra person can take over communication duties while you resolve the problem. This two-person approach works well. First, you don’t forget to send regular updates and second you have a co-worker to bounce ideas off of. 
Just talking through a problem with a co-worker can often help identify the problem. Don’t be a hero. Get help. &lt;br /&gt;&lt;br /&gt;Make sure you have a screen sharing room and phone conference reserved for your team. You can’t waste time passing around bad passwords or having multiple people setup phone conference whack-a-mole during an outage. Think of this like a fire drill. Everyone should know how to exit the building and where to assemble. &lt;br /&gt;&lt;br /&gt;A normal process means you’ve gone through this before. If you haven’t – practice with simulated events! Also make sure you have standard email templates on multiple servers or your laptop. It’s critical that your email template identifies the incident severity, a short description, and the time of your next update. This will prevent panicky people from bugging you with status requests. &lt;br /&gt;&lt;br /&gt;Identifying what is actually wrong is an entire art in itself and something I’ll try to address in a future post &lt;br /&gt;&lt;ul&gt;&lt;ul&gt;&lt;/ul&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1936864837167875331-4740036477629880062?l=farmhead.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://farmhead.blogspot.com/feeds/4740036477629880062/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://farmhead.blogspot.com/2010/04/tips-for-handling-service-incidents.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/4740036477629880062'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/4740036477629880062'/><link rel='alternate' type='text/html' href='http://farmhead.blogspot.com/2010/04/tips-for-handling-service-incidents.html' title='Tips for 
Handling Service Incidents'/><author><name>Farm</name><uri>http://www.blogger.com/profile/12178177852777050635</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1936864837167875331.post-1677819298068738429</id><published>2010-04-30T15:23:00.000-07:00</published><updated>2010-05-06T11:49:28.113-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='incident management'/><category scheme='http://www.blogger.com/atom/ns#' term='event management'/><title type='text'>Tips for Handling Service Outages (multiple services affected)</title><content type='html'>This is the second post related to “Tips for Handling Events, Incidents, Outages, and Maintenance”.&lt;br /&gt;&lt;br /&gt;You’re about to have an interesting day/night. Multiple business-critical services are offline or having intermittent problems. The outage affects revenue, your company-wide outage processes have been started, and you’re the lucky person on the on-call roster to lead the outage response.&lt;br /&gt;&lt;br /&gt;This is a different beast than resolving a problem with a single service. Here, you’re going to have to coordinate across several services and try to get the entire system up and running as soon as possible. The biggest obstacles here are coordination, communication, and discipline. This requires lots of practice before you get good in the role.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;I’ve resisted role terminology as much as possible up to this point but I’m going to introduce three roles here: the Incident Manager, the Service Manager, and the Communication Manager. 
There are lots of names for these roles and you can use any name as long as it’s clear that the Incident Manager is in charge of the entire incident, each Service Manager is responsible for reporting on their service, and the Communication Manager is responsible for handling communication with everyone else and pulling people in as necessary. It’s critical that everyone understand these roles and relationships or a phone call/chat session can turn into a nightmare.&lt;br /&gt;&lt;br /&gt;The Incident Manager’s job here is to drive communication, get regular updates from the Service Managers, and look for higher-level patterns across the affected services. For example, let’s say three separate services are all having connectivity problems. Each Service Manager is going to be heads down looking at their particular service. The Incident Manager should be looking for a common theme here.&lt;br /&gt;&lt;br /&gt;Let’s say that all of the services are having network problems. An Incident Manager might start with the following questions: “Are the services having problems trying to connect to the same place? Are they having problems trying to reach each other?” A quick view of a network architecture diagram can tell him what is in common and he can start asking the right questions or he can have the Communication Manager pull in the right team. In this case, let’s say all the services are connected to each other via a common VPN circuit between data centers and the VPN team isn’t on the call. The Incident Manager could then pull in the VPN team to verify their systems even if the VPN team’s monitors haven’t gone off. The point is that the Incident Manager has the bandwidth to pull people in and explore ideas while the Service Managers are busy troubleshooting their particular service. 
The Incident Manager is a critical part of restoring service as quickly as possible and it’s the bandwidth to look at the big picture that makes this possible.&lt;br /&gt;&lt;br /&gt;Here are some tips for Incident Managers:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Get your head straight (see previous post on &lt;a href="http://farmhead.blogspot.com/2010/04/tips-for-handling-service-incidents.html"&gt;Tips for Handling Service Incidents&lt;/a&gt;)&lt;/li&gt;&lt;li&gt;Stay calm and immediately address panic. Your demeanor will affect everyone on the call. Similarly, one panicky person can result in lost time or productivity. There’s no time for panic, stress, or sniping during a call. Address any of these problems immediately.&lt;/li&gt;&lt;li&gt;Control your call. Don’t give out the phone conference number. Non-essential people shouldn’t be joining your meeting. If they feel like they can contribute, they should go through the Communication Manager. &lt;/li&gt;&lt;li&gt;Don’t assume anything. “Should” is a dirty word during an outage. It’s your job as Incident Manager to validate assumptions while the teams are debugging their specific systems.&lt;/li&gt;&lt;li&gt;Enforce phone discipline. If someone is calling in and has a lot of background noise, tell them to mute, move to a quieter location, or switch to a different phone if possible.&lt;/li&gt;&lt;li&gt;Get comfortable with silence. Most people want to fill the void or speculate during these calls. Let the Service Managers do their job and spend your time as Incident Manager thinking and calling in the right people.&lt;/li&gt;&lt;li&gt;Do a roll-call every five minutes. You should read through the issues that you’re tracking and ask each Service Manager for an update. For example, “Five minute roll-call. We’re tracking two issues. Networking team. You’re a go for update. [update happens]. Thank you. Storage team. 
You’re a go for update…” These five-minute updates help everyone stay focused on the problems and help provide context for anyone joining the call. Framing the list of issues also helps someone correct you if you’ve accidentally forgotten about an issue.&lt;/li&gt;&lt;li&gt;Consider an ATC-like (air traffic control) model for communication. This is mostly my preference since I’m a pilot and a nerd but I find that enforcing communication discipline greatly helps during an outage as you don’t lose time to random conversation. Treat the phone as a precious shared resource – something that Service Managers shouldn’t hog with a long question or discussion. They should first request time from the Incident Manager. In the example below, note the control flow and that everyone positively acknowledges their assignment. There’s nothing worse than silence when you ask someone to do something. They should confirm that they understand you and are working on the task. In the following example, John is the Incident Manager and Dave is the Communication Manager:&lt;/li&gt;&lt;/ul&gt;&lt;blockquote&gt;&lt;b&gt;Storage team:&lt;/b&gt; “John, storage team has an update”&lt;br /&gt;&lt;b&gt;Incident Manager:&lt;/b&gt; “OK, thanks. Stand by. Dave, please call in the VPN team on-call engineer”&lt;br /&gt;&lt;b&gt;Communication Manager:&lt;/b&gt; “OK. I’m calling in the VPN team on-call engineer”&lt;br /&gt;&lt;b&gt;Incident Manager:&lt;/b&gt; “Thanks. Storage team. Go ahead” &lt;br /&gt;&lt;b&gt;Storage team:&lt;/b&gt; “We think there may be a network problem outside of our rack”&lt;br /&gt;&lt;b&gt;Incident Manager:&lt;/b&gt; “We’re thinking the same. We’re narrowing in on VPN. Please do a traceroute and get back to me with the results”&lt;br /&gt;&lt;b&gt;Storage team:&lt;/b&gt; “OK. 
I’ll call back with the traceroute”&lt;br /&gt;[Incident Manager adds this to his tracking list for the next five-minute update]&lt;br /&gt;[Meanwhile, the Communication Manager is capturing relevant events and sending out updates as required]&lt;br /&gt;[Later, from the storage team…]&lt;br /&gt;&lt;b&gt;Storage team:&lt;/b&gt; “Storage team has the traceroute”&lt;br /&gt;&lt;b&gt;Incident Manager:&lt;/b&gt; “Go ahead storage”&lt;br /&gt;&lt;b&gt;Storage team:&lt;/b&gt; “It looks like we’re getting blackholed by our outbound carrier. We’re working with our colo provider on their BGP routes”&lt;br /&gt;&lt;b&gt;Incident Manager:&lt;/b&gt; “Thanks. I’ll track that. Network, please confirm the traceroute blackhole.”&lt;br /&gt;[All this can happen in parallel before the VPN team even joins the call]&lt;/blockquote&gt;&lt;ul&gt;&lt;ul&gt;&lt;/ul&gt;&lt;li&gt;Carefully track your events. Pen and paper are great but a text editor and a screen-sharing program in a designated room (with Adobe Connect for example) are even better. You can post all information and updates in windows and everyone immediately sees the current status as soon as they join the room.&lt;/li&gt;&lt;li&gt;Make sure you are all referring to a single version of the truth. If you have a monitoring system and an executive dashboard, make sure everyone is looking at the monitoring system and not some fancy dashboard that was thrown together by another group.&lt;/li&gt;&lt;li&gt;Start preparing a shift rotation if it’s a multi-day event. The worst thing you can do is stay up for 48 hours. I’ve done this before and, while it seemed heroic at the time, it was just plain stupid. 
You want your people at their best if you’re going into extra innings.&lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1936864837167875331-1677819298068738429?l=farmhead.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://farmhead.blogspot.com/feeds/1677819298068738429/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://farmhead.blogspot.com/2010/04/tips-for-handling-service-outages.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/1677819298068738429'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/1677819298068738429'/><link rel='alternate' type='text/html' href='http://farmhead.blogspot.com/2010/04/tips-for-handling-service-outages.html' title='Tips for Handling Service Outages (multiple services affected)'/><author><name>Farm</name><uri>http://www.blogger.com/profile/12178177852777050635</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1936864837167875331.post-4845170038998393679</id><published>2010-04-30T15:20:00.000-07:00</published><updated>2010-05-06T11:51:41.062-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='system maintenance'/><title type='text'>Tips and Tricks for System Maintenance</title><content type='html'>Tips and Tricks for System Maintenance&lt;br /&gt;&lt;br /&gt;This is the third and final post related to “Tips for Handling Events, Incidents, Outages, and Maintenance”.&lt;br /&gt;&lt;br /&gt;System Maintenance is a more relaxed version of the previous posts but there’s 
still a high probability that you might cause an incident during your maintenance. The difference is that you actually get to plan for it. System upgrades and maintenance can be very complicated and involve multiple teams doing many things in parallel. I’ve seen several maintenance periods where there wasn’t a clear plan and people worked on things before they were asked to, causing all sorts of problems (imagine your network team killing the firewall before you diverted services to another cluster). This is never a pretty situation.&lt;br /&gt;&lt;br /&gt;&lt;a name='more'&gt;&lt;/a&gt;As in the previous parts of this post, I highly recommend a second person and an ATC-like communication structure here to save time. I’ve probably taken this to an extreme, but I’ve gone with the flying/pilot/co-pilot metaphor:&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Develop a flight-plan &lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Write down exactly what the goal of the maintenance is, what you’re going to do, and when you’re going to do it (down to the five-minute window in which each task will occur). This helps uncover any gaps in your maintenance (like forgetting to let existing connections finish before restarting a server). &lt;br /&gt;&lt;br /&gt;&lt;b&gt;Plan for Bingo &lt;/b&gt;&lt;br /&gt;&lt;br /&gt;“Bingo” is the point in a flight where you have just enough fuel to get back to the airport. Pick a time by which your maintenance has to be done or you roll back. This is mostly for disruptive maintenance but it’s also useful as part of a big upgrade. If you haven’t gotten it right by a certain time, you have to ask whether it’s better to roll back or whether it’s too dangerous to roll back. The co-pilot role is responsible for enforcing “bingo”. &lt;br /&gt;&lt;br /&gt;&lt;b&gt;Do a Preflight &lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Review the maintenance at least one day prior. 
Everyone should be clear on the order of events and their role. &lt;br /&gt;&lt;br /&gt;&lt;b&gt;Take-off &lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Make sure you’re all at the same airport. In this case, make sure you document which data center and which cluster you’re working on. Don’t assume. Have everyone confirm that they are in a shell in the correct data center and cluster. &lt;br /&gt;&lt;br /&gt;Start the clock and the announcements so the co-pilot can track progress against the plan and send updates. &lt;br /&gt;Read each line of the flight-plan before asking someone to do something. People can jump ahead and accidentally do a step out of order. For complicated maintenance, you need to coordinate each activity. &lt;br /&gt;&lt;br /&gt;&lt;b&gt;Pass Control &lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Consider a single control point if your maintenance is complicated with lots of people and coordination. In this model, only one person has the airplane at a time. This prevents multiple people from doing maintenance that interferes with each other. For example: &lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&lt;b&gt;Pilot:&lt;/b&gt; “Storage, please cut over to the new cluster. You have control”.&lt;br /&gt;&lt;b&gt;Storage Team:&lt;/b&gt; “Cutting over to the new cluster. I have control”&lt;br /&gt;[The pilot then does the normal five-minute updates, as with an outage, or refers to the expected time on the flight plan and asks the storage team for an update if that time expires]&lt;br /&gt;[Later, when the storage person is done]&lt;br /&gt;&lt;b&gt;Storage Team:&lt;/b&gt; “Storage cutover complete. Returning control”.&lt;br /&gt;&lt;b&gt;Pilot:&lt;/b&gt; “I have control. Network team, ...” &lt;/blockquote&gt;&lt;br /&gt;This is overkill for simple upgrades but, with complicated upgrades, I’ve seen this reduce confusion and prevent problems. 
In the example above, nobody is allowed to do any other type of maintenance while the storage team has control (think of it as a critical section for maintenance).&lt;br /&gt;&lt;br /&gt;On the other hand, sometimes you just don’t have this luxury due to time constraints and you need to run things in parallel. There’s nothing wrong with running things in parallel as long as you’ve planned appropriately and understand the risks. &lt;br /&gt;&lt;br /&gt;&lt;b&gt;Follow a Landing Checklist&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Always follow the appropriate checklist for restarting a service. This can often get skipped after maintenance if you’ve had an interesting upgrade. But, there’s nothing worse than forgetting to turn off a “maintenance” page and getting woken up hours later to find out that you caused multiple hours of downtime by missing this simple check. &lt;br /&gt;&lt;br /&gt;&lt;b&gt;Use Autopilot&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;Finally, if your flight-plan is complicated – automate it! 
Run scripts instead of doing many manual steps.&lt;br /&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1936864837167875331-4845170038998393679?l=farmhead.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://farmhead.blogspot.com/feeds/4845170038998393679/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://farmhead.blogspot.com/2010/04/tips-and-tricks-for-system-maintenance.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/4845170038998393679'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/4845170038998393679'/><link rel='alternate' type='text/html' href='http://farmhead.blogspot.com/2010/04/tips-and-tricks-for-system-maintenance.html' title='Tips and Tricks for System Maintenance'/><author><name>Farm</name><uri>http://www.blogger.com/profile/12178177852777050635</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1936864837167875331.post-3029697925784025369</id><published>2010-04-26T15:02:00.000-07:00</published><updated>2010-04-26T15:02:10.995-07:00</updated><title type='text'>HBase Performance Post</title><content type='html'>The folks at hstack.org posted a great overview of their latest &lt;a href="http://hstack.org/hbase-performance-testing/"&gt;HBase performance testing&lt;/a&gt;. There are some hard-earned performance testing nuggets in the article along with interesting random read/write and map-reduce stats. 
They also have a small peek at their hardware configuration (lots of spindles!).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1936864837167875331-3029697925784025369?l=farmhead.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://farmhead.blogspot.com/feeds/3029697925784025369/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://farmhead.blogspot.com/2010/04/hbase-performance-post.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/3029697925784025369'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/3029697925784025369'/><link rel='alternate' type='text/html' href='http://farmhead.blogspot.com/2010/04/hbase-performance-post.html' title='HBase Performance Post'/><author><name>Farm</name><uri>http://www.blogger.com/profile/12178177852777050635</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1936864837167875331.post-5253646994965592173</id><published>2010-04-15T07:40:00.000-07:00</published><updated>2010-04-15T07:43:32.884-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='RAISE'/><category scheme='http://www.blogger.com/atom/ns#' term='ssd'/><category scheme='http://www.blogger.com/atom/ns#' term='anandtech'/><category scheme='http://www.blogger.com/atom/ns#' term='sandforce'/><title type='text'>SSD Primer and SandForce Controller</title><content type='html'>If you're new to SSDs or just want a fantastic in-depth review of the technology along with real-world testing, check out this &lt;a 
href="http://anandtech.com/show/2829"&gt;anandtech.com article&lt;/a&gt;. If you're looking for a more recent analysis of SSD products, check out this &lt;a href="http://www.anandtech.com/show/2738"&gt;March article&lt;/a&gt;, also from anandtech.com.&lt;br /&gt;&lt;br /&gt;The SandForce controller has been getting a lot of press lately, following on the heels of the Indilinx controller. Here's a &lt;a href="http://www.anandtech.com/show/3656/corsairs-force-ssd-reviewed-sf1200-is-very-good/1"&gt;review of a new Corsair SSD with the SandForce controller&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The &lt;a href="http://www.sandforce.com/index.php?id=3"&gt;SandForce controller&lt;/a&gt; takes a different approach than other SSD controllers and has introduced deduplication and the RAID concept to NAND cells. They call their RAID approach RAISE (Redundant Array of Independent Silicon Elements). Check out &lt;a href="http://www.anandtech.com/show/2899/3"&gt;more here&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1936864837167875331-5253646994965592173?l=farmhead.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://farmhead.blogspot.com/feeds/5253646994965592173/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://farmhead.blogspot.com/2010/04/ssd-primer-and-sandforce-controller.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/5253646994965592173'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/5253646994965592173'/><link rel='alternate' type='text/html' href='http://farmhead.blogspot.com/2010/04/ssd-primer-and-sandforce-controller.html' title='SSD Primer and SandForce 
Controller'/><author><name>Farm</name><uri>http://www.blogger.com/profile/12178177852777050635</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1936864837167875331.post-6326095186777902476</id><published>2010-04-15T07:15:00.000-07:00</published><updated>2010-04-15T07:18:46.472-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='riak'/><category scheme='http://www.blogger.com/atom/ns#' term='neo4j'/><category scheme='http://www.blogger.com/atom/ns#' term='couchdb'/><category scheme='http://www.blogger.com/atom/ns#' term='the guardian'/><category scheme='http://www.blogger.com/atom/ns#' term='cassandra'/><category scheme='http://www.blogger.com/atom/ns#' term='nosql'/><category scheme='http://www.blogger.com/atom/ns#' term='amazon'/><category scheme='http://www.blogger.com/atom/ns#' term='mongodb'/><category scheme='http://www.blogger.com/atom/ns#' term='dynamo'/><category scheme='http://www.blogger.com/atom/ns#' term='rackspace'/><category scheme='http://www.blogger.com/atom/ns#' term='bbc'/><title type='text'>nosql:eu conference (April 20-22) looks very interesting</title><content type='html'>I'm really looking forward to some of the slides and posts from the upcoming nosql conference in London.&lt;br /&gt;&lt;br /&gt;The &lt;a href="http://nosqleu.com/#agenda"&gt;agenda &lt;/a&gt;is a drool-fest for any scalable KVP/metadata geek. 
I like how most of the talks are centered on real-world use of various technologies (Cassandra, CouchDB, Dynamo, HBase, MongoDB, Neo4j, Riak, etc.).&lt;br /&gt;&lt;br /&gt;Here's a copy of the agenda:&lt;br /&gt;&lt;br /&gt;&lt;table&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td class="sql" colspan="2"&gt;Tuesday April 20&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="dim" style="text-align: right;"&gt;08.30&lt;/td&gt; &lt;td&gt;&amp;nbsp;- Registration, Coffee and Mingle&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="dim" style="text-align: right;"&gt;09.30&lt;/td&gt; &lt;td&gt;&amp;nbsp;- The Guardian's use of NoSQL - &lt;span class="bright"&gt;Matthew Wall&lt;/span&gt;,  &lt;span class="east"&gt;The Guardian&lt;/span&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="dim" style="text-align: right;"&gt;10.30&lt;/td&gt; &lt;td class="dim"&gt;&amp;nbsp;- Coffee break and mingle&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="dim" style="text-align: right;"&gt;10.50&lt;/td&gt; &lt;td&gt;&amp;nbsp;- An overview of NoSQL - &lt;a class="command speaker" href="http://nosqleu.com/#speaker/popescu"&gt;Alex Popescu&lt;/a&gt;, &lt;span class="east"&gt;MyNoSQL&lt;/span&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="dim" style="text-align: right;"&gt;11.50&lt;/td&gt; &lt;td class="dim"&gt;&amp;nbsp;- Lunch break and mingle&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="dim" style="text-align: right;"&gt;13.00&lt;/td&gt; &lt;td&gt;&amp;nbsp;- Key-value stores and Riak - &lt;a class="command speaker" href="http://nosqleu.com/#speaker/fink"&gt;Bryan  Fink&lt;/a&gt;, &lt;span class="east"&gt;Basho&lt;/span&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="dim" style="text-align: right;"&gt;14.00&lt;/td&gt; &lt;td class="dim"&gt;&amp;nbsp;- Coffee break and mingle&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="dim" style="text-align: right;"&gt;14.20&lt;/td&gt; &lt;td&gt;&amp;nbsp;- Document-oriented databases and MongoDB - &lt;a class="command speaker" 
href="http://nosqleu.com/#speaker/stearn"&gt;Mathias   Stearn&lt;/a&gt;, &lt;span class="east"&gt;10gen&lt;/span&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="dim" style="text-align: right;"&gt;15.20&lt;/td&gt; &lt;td class="dim"&gt;&amp;nbsp;- Coffee break and mingle&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="dim" style="text-align: right;"&gt;15.40&lt;/td&gt; &lt;td&gt;&amp;nbsp;- Column-oriented databases and Cassandra - &lt;span class="bright"&gt;Jonathan   Ellis&lt;/span&gt;&lt;span class="bright"&gt;&amp;nbsp;&lt;/span&gt;, &lt;span class="east"&gt;Rackspace&lt;/span&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="dim" style="text-align: right;"&gt;16.40&lt;/td&gt; &lt;td class="dim"&gt;&amp;nbsp;- Coffee break and mingle&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="dim" style="text-align: right;"&gt;17.00&lt;/td&gt; &lt;td&gt;&amp;nbsp;- Graph databases and Neo4j - &lt;span class="bright"&gt;&lt;a class="command speaker" href="http://nosqleu.com/#speaker/eifrem"&gt;Emil  Eifrem&lt;/a&gt;&lt;/span&gt;, &lt;span class="east"&gt;Neo Technology&lt;/span&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="dim" style="text-align: right;"&gt;18.00&lt;/td&gt; &lt;td&gt;&amp;nbsp;- Evening party with loads of beer and mingle&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="sql" colspan="2"&gt;&lt;br /&gt;Wednesday April 21&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="dim" style="text-align: right;"&gt;08.30&lt;/td&gt; &lt;td class="dim"&gt;&amp;nbsp;- Coffee and mingle&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="dim" style="text-align: right;"&gt;09.30&lt;/td&gt; &lt;td&gt;&amp;nbsp;- On the Birth of Dynamo - &lt;a class="command speaker" href="http://nosqleu.com/#speaker/vogels"&gt;Werner  Vogels&lt;/a&gt;, &lt;span class="east"&gt;Amazon&lt;/span&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="dim" style="text-align: right;"&gt;10.30&lt;/td&gt; &lt;td class="dim"&gt;&amp;nbsp;- Coffee break and mingle&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td 
class="dim" style="text-align: right;"&gt;10.50&lt;/td&gt; &lt;td&gt;&amp;nbsp;- Twitter's use of Cassandra, Pig and HBase - &lt;a class="command speaker" href="http://nosqleu.com/#speaker/weil"&gt;Kevin  Weil&lt;/a&gt;, &lt;span class="east"&gt;Twitter&lt;/span&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="dim" style="text-align: right;"&gt;11.50&lt;/td&gt; &lt;td class="dim"&gt;&amp;nbsp;- Lunch break and mingle&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="dim" style="text-align: right;"&gt;13.00&lt;/td&gt; &lt;td&gt;&amp;nbsp;- CouchDB at the BBC - &lt;a class="command speaker" href="http://nosqleu.com/#speaker/farrell"&gt;Enda Farrell&lt;/a&gt;, &lt;span class="east"&gt;BBC&lt;/span&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="dim" style="text-align: right;"&gt;14.00&lt;/td&gt; &lt;td class="dim"&gt;&amp;nbsp;- Coffee break and mingle&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="dim" style="text-align: right;"&gt;14.20&lt;/td&gt; &lt;td&gt;&amp;nbsp;- Why Big Enterprises are Interested in NoSQL - &lt;a class="command speaker" href="http://nosqleu.com/#speaker/moore"&gt;Jon Moore&lt;/a&gt;, &lt;span class="east"&gt;Comcast&lt;/span&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="dim" style="text-align: right;"&gt;15.20&lt;/td&gt; &lt;td class="dim"&gt;&amp;nbsp;- Coffee break and mingle&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="dim" style="text-align: right;"&gt;15.40&lt;/td&gt; &lt;td&gt;&amp;nbsp;- Memory as the New Disk: Why Redis Rocks - &lt;a class="command speaker" href="http://nosqleu.com/#speaker/lossen"&gt;Tim Lossen&lt;/a&gt;, &lt;span class="east"&gt;Wooga&lt;/span&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td style="text-align: right;"&gt;&lt;span class="dim"&gt;15.55&lt;/span&gt;&lt;/td&gt; &lt;td&gt;&amp;nbsp;- Tokyo Cabinet, Tokyo Tyrant and Kyoto Cabinet - &lt;a class="command speaker" href="http://nosqleu.com/#speaker/inoue"&gt;Makoto Inoue&lt;/a&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td style="text-align: 
right;"&gt;&lt;span class="dim"&gt;16.10&lt;/span&gt;&lt;/td&gt; &lt;td&gt;&amp;nbsp;- Thomas Kuhn Predicted the Fate of the Relational Database - &lt;a class="command speaker" href="http://nosqleu.com/#speaker/robbins"&gt;Neil  Robbins&lt;/a&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td style="text-align: right;"&gt;&lt;span class="dim"&gt;16.25&lt;/span&gt;&lt;/td&gt; &lt;td&gt;&amp;nbsp;- Notes from the field: NoSQL tools in Production - &lt;a class="command speaker" href="http://nosqleu.com/#speaker/ford"&gt;Matthew  Ford&lt;/a&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="dim" style="text-align: right;"&gt;16.40&lt;/td&gt; &lt;td class="dim"&gt;&amp;nbsp;- Coffee break and mingle&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="dim" style="text-align: right;"&gt;17.00&lt;/td&gt; &lt;td&gt;&amp;nbsp;- Panel debate - Moderated by &lt;a class="command speaker" href="http://nosqleu.com/#speaker/governor"&gt;James Governor&lt;/a&gt;, &lt;span class="east"&gt;RedMonk&lt;/span&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="sql" colspan="2"&gt;&lt;br /&gt;Thursday April 22&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="dim" style="text-align: right;"&gt;08.30&lt;/td&gt; &lt;td&gt;&amp;nbsp;- Registration, Coffee and Mingle&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="dim" style="text-align: right;"&gt;09.00&lt;/td&gt; &lt;td&gt;&amp;nbsp;- Morning workshops - Choose between:&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td&gt;&lt;br /&gt;&lt;/td&gt; &lt;td&gt;&amp;nbsp;- MongoDB - &lt;a class="command speaker" href="http://nosqleu.com/#speaker/stearn"&gt;Mathias Stearn&lt;/a&gt;, &lt;span class="east"&gt;10gen&lt;/span&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td&gt;&lt;br /&gt;&lt;/td&gt; &lt;td&gt;&amp;nbsp;- Riak - &lt;a class="command speaker" href="http://nosqleu.com/#speaker/fink"&gt;Bryan Fink&lt;/a&gt;, &lt;span class="east"&gt;Basho&lt;/span&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td style="text-align: right;"&gt;&lt;span 
class="dim"&gt;12:30&lt;/span&gt;&lt;/td&gt; &lt;td class="dim"&gt;&amp;nbsp;- Lunch break and mingle&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="dim" style="text-align: right;"&gt;13:30&lt;/td&gt; &lt;td&gt;&amp;nbsp;- Afternoon workshops - Choose between:&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="dim"&gt;&lt;br /&gt;&lt;/td&gt; &lt;td&gt;&amp;nbsp;- Redis - Simon Willison, &lt;span class="east"&gt;The Guardian&lt;/span&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td&gt;&lt;br /&gt;&lt;/td&gt; &lt;td&gt;&amp;nbsp;- Neo4j - &lt;a class="command speaker" href="http://nosqleu.com/#speaker/eifrem"&gt;Emil Eifrém&lt;/a&gt;, &lt;span class="east"&gt;Neo Technology&lt;/span&gt;&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td class="dim" style="text-align: right;"&gt;17.00&lt;/td&gt; &lt;td&gt;&amp;nbsp;- Thank you and see you next year!&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1936864837167875331-6326095186777902476?l=farmhead.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://farmhead.blogspot.com/feeds/6326095186777902476/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://farmhead.blogspot.com/2010/04/nosqleu-conference-april-20-22-looks.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/6326095186777902476'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/6326095186777902476'/><link rel='alternate' type='text/html' href='http://farmhead.blogspot.com/2010/04/nosqleu-conference-april-20-22-looks.html' title='nosql:eu conference (April 20-22) looks very 
interesting'/><author><name>Farm</name><uri>http://www.blogger.com/profile/12178177852777050635</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1936864837167875331.post-6854647399362758079</id><published>2010-04-13T11:35:00.000-07:00</published><updated>2010-04-13T14:36:11.614-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='hstack'/><category scheme='http://www.blogger.com/atom/ns#' term='nosql'/><title type='text'>NoSQL interview with the hstack.org folks</title><content type='html'>Another interesting &lt;a href="http://nosql.mypopescu.com/"&gt;interview &lt;/a&gt;with the hstack.org folks, this time from a NoSQL blog.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1936864837167875331-6854647399362758079?l=farmhead.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://farmhead.blogspot.com/feeds/6854647399362758079/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://farmhead.blogspot.com/2010/04/nosql-interview-with-hstackorg-folks.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/6854647399362758079'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/6854647399362758079'/><link rel='alternate' type='text/html' href='http://farmhead.blogspot.com/2010/04/nosql-interview-with-hstackorg-folks.html' title='NoSQL interview with the hstack.org folks'/><author><name>Farm</name><uri>http://www.blogger.com/profile/12178177852777050635</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1936864837167875331.post-3059515858153410647</id><published>2010-04-09T10:57:00.000-07:00</published><updated>2010-04-09T15:53:19.914-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='hbase'/><category scheme='http://www.blogger.com/atom/ns#' term='adobe'/><title type='text'>Great post on HBase and Adobe</title><content type='html'>Cosmin Lehene wrote a great 2-part post about his team's experience with HBase. Here are links to &lt;a href="http://hstack.org/why-were-using-hbase-part-1/"&gt;part 1&lt;/a&gt; and &lt;a href="http://hstack.org/why-were-using-hbase-part-2/"&gt;part 2&lt;/a&gt;. I hear one of his partners-in-crime, Andrei, is working on another interesting post on performance testing related to this work.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1936864837167875331-3059515858153410647?l=farmhead.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://farmhead.blogspot.com/feeds/3059515858153410647/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://farmhead.blogspot.com/2010/04/great-post-on-hbase-and-adobe.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/3059515858153410647'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/3059515858153410647'/><link rel='alternate' type='text/html' href='http://farmhead.blogspot.com/2010/04/great-post-on-hbase-and-adobe.html' title='Great post on HBase and 
Adobe'/><author><name>Farm</name><uri>http://www.blogger.com/profile/12178177852777050635</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1936864837167875331.post-6758916642896496767</id><published>2010-04-09T09:54:00.001-07:00</published><updated>2010-04-09T15:55:57.648-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='programming'/><title type='text'>FastFlow Parallel Programming Framework</title><content type='html'>I’ve been looking into Intel’s Threading Building Blocks during the early morning hours here in Bucharest (jet lag) and ran across an interesting library that provides non-blocking, lock-free, wait-free synchronization mechanisms.&lt;br /&gt;&lt;br /&gt;Check out this tutorial page with small code snippets and some sample pipelines/farms:&lt;br /&gt;&lt;br /&gt;http://calvados.di.unipi.it/dokuwiki/doku.php?id=ffnamespace:usermanual&lt;br /&gt;&lt;br /&gt;Here are some background links:&lt;br /&gt;&lt;br /&gt;http://en.wikipedia.org/wiki/Fastflow_%28Computer_Science%29&lt;br /&gt;http://calvados.di.unipi.it/dokuwiki/doku.php?id=ffnamespace:about&lt;br /&gt;&lt;br /&gt;From the FastFlow page:&lt;br /&gt;&lt;br /&gt;“FastFlow is a parallel programming framework for multi-core platforms based upon non-blocking lock-free/fence-free synchronization mechanisms. The framework is composed of a stack of layers that progressively abstracts out the programming of shared-memory parallel applications. The goal of the stack is twofold: to ease the development of applications and make them very fast and scalable. 
FastFlow is particularly targeted to the development of streaming applications.”&lt;br /&gt;&lt;br /&gt;From Wikipedia:&lt;br /&gt;&lt;br /&gt;“Fastflow is implemented as a template library that offers a set of low-level mechanisms to support low-latency and high-bandwidth data flows in a network of threads running on a cache-coherent multi-core.[1] On these architectures, the key performance issues concern memory fences, which are required to keep the various caches coherent. Fastflow provides the programmer with two basic mechanisms: efficient communication channels and a memory allocator. Communication channels, as typical is in streaming applications, are unidirectional and asynchronous. They are implemented via lock-free (and memory fence-free) Multiple-Producer-Multiple-Consumer (MPMC) queues. The memory allocator is built on top of these queues, thus taking advantage of their efficiency.”&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1936864837167875331-6758916642896496767?l=farmhead.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://farmhead.blogspot.com/feeds/6758916642896496767/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://farmhead.blogspot.com/2010/04/fastflow-parallel-programming-framework.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/6758916642896496767'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/6758916642896496767'/><link rel='alternate' type='text/html' href='http://farmhead.blogspot.com/2010/04/fastflow-parallel-programming-framework.html' title='FastFlow Parallel Programming 
Framework'/><author><name>Farm</name><uri>http://www.blogger.com/profile/12178177852777050635</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1936864837167875331.post-3539711037930569314</id><published>2010-04-09T09:49:00.000-07:00</published><updated>2010-04-09T15:56:18.456-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='cpus'/><title type='text'>AMD 12-core opteron versus 6 core xeon</title><content type='html'>I'd like to have seen a larger set of tests thrown at this one, but you have to love all the auto-enthusiast references in this &lt;a href="http://anandtech.com/show/2978/amd-s-12-core-magny-cours-opteron-6174-vs-intel-s-6-core-xeon"&gt;anandtech.com review&lt;/a&gt; of the new 12-core Opteron versus the newer 6-core Xeon. &lt;br /&gt;&lt;br /&gt;That's two 6-core Istanbul chips bolted together. 
Reminds me a bit of the Pentium D with a much larger cache coherency problem (imagine how much of a problem this is going to be as we keep adding cores to chips).&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_RH_azLsKd_I/S79aLVGAGnI/AAAAAAAAAAs/_WPM3CuBTkk/s1600/amd12core.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_RH_azLsKd_I/S79aLVGAGnI/AAAAAAAAAAs/_WPM3CuBTkk/s320/amd12core.jpg" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1936864837167875331-3539711037930569314?l=farmhead.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://farmhead.blogspot.com/feeds/3539711037930569314/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://farmhead.blogspot.com/2010/04/amd-12-core-opteron-versus-6-core-xeon.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/3539711037930569314'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/3539711037930569314'/><link rel='alternate' type='text/html' href='http://farmhead.blogspot.com/2010/04/amd-12-core-opteron-versus-6-core-xeon.html' title='AMD 12-core opteron versus 6 core xeon'/><author><name>Farm</name><uri>http://www.blogger.com/profile/12178177852777050635</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_RH_azLsKd_I/S79aLVGAGnI/AAAAAAAAAAs/_WPM3CuBTkk/s72-c/amd12core.jpg' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1936864837167875331.post-3388321799312706424</id><published>2010-04-09T09:38:00.000-07:00</published><updated>2010-04-09T15:56:32.663-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='storage'/><title type='text'>New WD Velociraptor VR200M</title><content type='html'>WD has released its next-generation VelociRaptor (10K RPM, 2.5" disk). It has a new 6Gbps interface and 600 GB of space. There's an interesting review comparing this disk with a couple of non-enterprise SSDs &lt;a href="http://techreport.com/articles.x/18712/1"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_RH_azLsKd_I/S79XxqEehII/AAAAAAAAAAk/Tt4GQhTaVPk/s1600/vr200m.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_RH_azLsKd_I/S79XxqEehII/AAAAAAAAAAk/Tt4GQhTaVPk/s320/vr200m.jpg" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1936864837167875331-3388321799312706424?l=farmhead.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://farmhead.blogspot.com/feeds/3388321799312706424/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://farmhead.blogspot.com/2010/04/new-wd-velociraptor-vr200m.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/3388321799312706424'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/3388321799312706424'/><link rel='alternate' type='text/html' href='http://farmhead.blogspot.com/2010/04/new-wd-velociraptor-vr200m.html' title='New WD Velociraptor 
VR200M'/><author><name>Farm</name><uri>http://www.blogger.com/profile/12178177852777050635</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_RH_azLsKd_I/S79XxqEehII/AAAAAAAAAAk/Tt4GQhTaVPk/s72-c/vr200m.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1936864837167875331.post-148322621015878489</id><published>2010-04-09T09:31:00.000-07:00</published><updated>2010-04-09T15:56:58.574-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='virtualization'/><category scheme='http://www.blogger.com/atom/ns#' term='cpus'/><category scheme='http://www.blogger.com/atom/ns#' term='motherboard'/><title type='text'>SuperMicro 24 core motherboard</title><content type='html'>Speaking of &lt;a href="http://www.supermicro.com/products/motherboard/Xeon7000/7300/X7QC3.cfm"&gt;24 core motherboards&lt;/a&gt; with loads of RAM, I ran across this new SuperMicro motherboard the other day when doing some research. 
It's truly terrifying how many cores and how much RAM you can toss onto one box now.&lt;br /&gt;&lt;br /&gt;Assuming one core is dedicated to Dom0, you could run 23 VMs, each with a dedicated core and just over 8GB of RAM (192GB split 23 ways is about 8.3GB each), if you install all 192GB of RAM.&lt;br /&gt;&lt;br /&gt;Here are some specs from the link above:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Quad Intel® 64-bit Xeon® MP Support 1066 MHz FSB&lt;/li&gt;&lt;li&gt;Intel® 7300 (Clarksboro) Chipset&lt;/li&gt;&lt;li&gt;Up to 192GB DDR2 ECC FB-DIMM (Fully Buffered DIMM)&lt;/li&gt;&lt;li&gt;Intel® 82575EB Dual-port Gigabit Ethernet Controller&lt;/li&gt;&lt;li&gt;LSI 1068e Dual Channel 8-Port SAS Controller&lt;/li&gt;&lt;li&gt;6x SATA (3 Gbps) Ports via ESB2 Controller&lt;/li&gt;&lt;li&gt;1 (x8) PCI-e (using x16 slot), 1 (x8) PCI-e (using x8 slot) &amp;amp; 1 (x4) PCI-e (using x8 slot), 1x 64-bit 133MHz PCI-X&lt;/li&gt;&lt;li&gt;ATI ES1000 Graphics with 32MB video memory&lt;/li&gt;&lt;li&gt;IPMI 2.0 (SIMSO) Slot&amp;nbsp; &lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;&lt;span class="keyFeatures"&gt;&lt;/span&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_RH_azLsKd_I/S79VjiqDrRI/AAAAAAAAAAc/Ol46BE4Bgq4/s1600/X7QC3_spec.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_RH_azLsKd_I/S79VjiqDrRI/AAAAAAAAAAc/Ol46BE4Bgq4/s320/X7QC3_spec.jpg" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1936864837167875331-148322621015878489?l=farmhead.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://farmhead.blogspot.com/feeds/148322621015878489/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://farmhead.blogspot.com/2010/04/supermicro-24-core-motherboard.html#comment-form' title='0 Comments'/><link rel='edit' 
type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/148322621015878489'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/148322621015878489'/><link rel='alternate' type='text/html' href='http://farmhead.blogspot.com/2010/04/supermicro-24-core-motherboard.html' title='SuperMicro 24 core motherboard'/><author><name>Farm</name><uri>http://www.blogger.com/profile/12178177852777050635</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_RH_azLsKd_I/S79VjiqDrRI/AAAAAAAAAAc/Ol46BE4Bgq4/s72-c/X7QC3_spec.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1936864837167875331.post-1146957350763378677</id><published>2010-04-09T09:22:00.000-07:00</published><updated>2010-04-09T15:57:23.198-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='virtualization'/><category scheme='http://www.blogger.com/atom/ns#' term='ssd'/><category scheme='http://www.blogger.com/atom/ns#' term='storage'/><title type='text'>OCZ PCI-e SSD with field-replaceable MLC NAND</title><content type='html'>OCZ is ready to mass produce its &lt;a href="http://www.techspot.com/news/38516-ocz-to-begin-mass-producing-its-zdrive-pcie-ssd-line.html"&gt;PCI-e SSDs&lt;/a&gt; with field-replaceable MLC NAND flash modules.&lt;br /&gt;&lt;br /&gt;This makes the MLC versus SLC debate a bit moot if you can just replace the NAND when it wears out like a bad disk. Did I mention that it has 8 separate Indilinx controllers, up to 2 TB of space, and peak transfer rates of 1.4GB/s for reads and writes (that’s gigabytes, not gigabits)? 
I can’t imagine what a SandForce-controller version of one of these monsters will do.&lt;br /&gt;&lt;br /&gt;This is some seriously interesting temporary storage for a virtualization cluster that needs fast DAS. With 2 TB, you could carve out roughly 87 gigabytes for each of 23 VMs on a 24-core virtualization box. That’s mighty interesting.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_RH_azLsKd_I/S79URzMZitI/AAAAAAAAAAU/FOogm4swfcU/s1600/ocz-z-drive.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_RH_azLsKd_I/S79URzMZitI/AAAAAAAAAAU/FOogm4swfcU/s320/ocz-z-drive.jpg" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1936864837167875331-1146957350763378677?l=farmhead.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://farmhead.blogspot.com/feeds/1146957350763378677/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://farmhead.blogspot.com/2010/04/ocz-pci-e-ssd-with-field-replaceable.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/1146957350763378677'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1936864837167875331/posts/default/1146957350763378677'/><link rel='alternate' type='text/html' href='http://farmhead.blogspot.com/2010/04/ocz-pci-e-ssd-with-field-replaceable.html' title='OCZ PCI-e SSD with field-replaceable MLC NAND'/><author><name>Farm</name><uri>http://www.blogger.com/profile/12178177852777050635</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_RH_azLsKd_I/S79URzMZitI/AAAAAAAAAAU/FOogm4swfcU/s72-c/ocz-z-drive.jpg' height='72' width='72'/><thr:total>0</thr:total></entry></feed>
