Friday, April 9, 2010

Great post on HBase and Adobe

Cosmin Lehene wrote a great 2-part post about his team's experience with HBase. Here are links to part 1 and part 2. I hear one of his partners-in-crime, Andrei, is working on another interesting post on performance testing related to this work.

FastFlow Parallel Programming Framework

I’ve been looking into Intel’s Threading Building Blocks during the early morning hours here in Bucharest (jet lag) and ran across an interesting library that provides non-blocking, lock-free, wait-free synchronization mechanisms.

Check out this tutorial page with small code snippets and some sample pipelines/farms:

http://calvados.di.unipi.it/dokuwiki/doku.php?id=ffnamespace:usermanual

Here are some background links:

http://en.wikipedia.org/wiki/Fastflow_%28Computer_Science%29
http://calvados.di.unipi.it/dokuwiki/doku.php?id=ffnamespace:about

From the FastFlow page:

“FastFlow is a parallel programming framework for multi-core platforms based upon non-blocking lock-free/fence-free synchronization mechanisms. The framework is composed of a stack of layers that progressively abstracts out the programming of shared-memory parallel applications. The goal of the stack is twofold: to ease the development of applications and make them very fast and scalable. FastFlow is particularly targeted to the development of streaming applications.”

From Wikipedia:

“Fastflow is implemented as a template library that offers a set of low-level mechanisms to support low-latency and high-bandwidth data flows in a network of threads running on a cache-coherent multi-core. On these architectures, the key performance issues concern memory fences, which are required to keep the various caches coherent. Fastflow provides the programmer with two basic mechanisms: efficient communication channels and a memory allocator. Communication channels, as is typical in streaming applications, are unidirectional and asynchronous. They are implemented via lock-free (and memory fence-free) Multiple-Producer-Multiple-Consumer (MPMC) queues. The memory allocator is built on top of these queues, thus taking advantage of their efficiency.”

AMD 12-core Opteron versus 6-core Xeon

I'd like to have seen a larger set of tests thrown at this one, but you have to love all the auto-enthusiast references in this anandtech.com review of the new 12-core Opteron versus the newer 6-core Xeon.

That's two 6-core Istanbul chips bolted together. It reminds me a bit of the Pentium D, but with a much larger cache-coherency problem (imagine how much worse this gets as we keep adding cores to chips).

New WD Velociraptor VR200M

WD has released their next generation VelociRaptor (10K RPM, 2.5" disk). It has a new 6Gbps interface and 600 GB of space. There's an interesting review comparing this disk versus a couple of non-enterprise SSDs here.

SuperMicro 24-core motherboard

Speaking of 24-core motherboards with loads of RAM, I ran across this new SuperMicro motherboard the other day while doing some research. It's truly terrifying how many cores and how much RAM you can toss into one box now.

Assuming one core is dedicated to a Dom0, you could have 23 VMs, each with a dedicated core and over 8GB of RAM if you install all 192GB.

Here are some specs from the link above:
  1. Quad Intel® 64-bit Xeon® MP Support 1066 MHz FSB
  2. Intel® 7300 (Clarksboro) Chipset
  3. Up to 192GB DDR2 ECC FB-DIMM (Fully Buffered DIMM)
  4. Intel® 82575EB Dual-port Gigabit Ethernet Controller
  5. LSI 1068e Dual Channel 8-Port SAS Controller
  6. 6x SATA (3 Gbps) Ports via ESB2 Controller
  7. 1 (x8) PCI-e (using x16 slot), 1 (x8) PCI-e (using x8 slot) & 1 (x4) PCI-e (using x8 slot); 1x 64-bit 133MHz PCI-X
  8. ATI ES1000 Graphics with 32MB video memory
  9. IPMI 2.0 (SIMSO) Slot 


OCZ PCI-e SSD with field-replaceable MLC NAND

OCZ is ready to mass-produce its PCI-e SSDs with field-replaceable MLC NAND flash modules.

This makes the MLC-versus-SLC debate a bit moot if you can just replace the NAND when it wears out, like a bad disk. Did I mention that it has 8 separate Indilinx controllers, up to 2TB of space, and peak transfer rates of 1.4GB/s for reads and writes (that’s gigabytes, not gigabits)? I can’t imagine what will happen with a SandForce-controller version of one of these monsters.

This is some seriously interesting temporary storage for a virtualization cluster that needs fast DAS. With 2TB, you could carve out roughly 87 gigabytes for each of 23 VMs on a 24-core virtualization box. That’s mighty interesting.