Big Data, big challenge. In this session we’ll look at how companies are building high-performance systems to manage, access, analyze, search, and share the massive datasets that drive web-scale applications and other data-intensive apps. We’ll compare relational and non-relational approaches and look at how these different paths affect the architecture of the system.
RDBMS/SQL, NoSQL, Key-Value stores, Wide Columns, Eventual Consistency, Massive Parallelism. We’re bringing together a great panel of speakers to discuss and share knowledge about the technology and techniques being used to manage the ever-increasing amount of data.
Join us for a great discussion and a great networking event!
Follow @geekSessions on Twitter for ongoing announcements, FREE TICKETS, Speaker Conversations and more!
What: geekSessions 2.1: Data Scalability – SQL or NoSQL?
When: Tuesday, May 3rd, 6:00pm-10:00pm
- 6:00pm – Networking, Snacks, Free Beer
- 7:00pm – Panel discussion and Q&A (bar closed)
- 8:30pm – More socializing
About the Speakers
Jason Lucas is the scalability architect for Tagged (www.tagged.com), the third-largest social networking system in the world. Jason has worked for Google on large-scale distributed systems and for Microsoft on the Visual C++ compiler. He also spent almost ten years working on artificial intelligence systems for treating HIV/AIDS in Africa. These days Jason focuses on problems in the NoSQL space, creating planetary-scale data services that are reliable, fast, cheap, and, if at all possible, easy to use.
Researcher, Machine Learning Department, CMU
[click here for Danny’s slides]
Danny Bickson is a postdoctoral researcher in the Machine Learning Department at Carnegie Mellon University, hosted by Prof. Carlos Guestrin (CMU) and Prof. Joseph Hellerstein (Berkeley). His most recent project, GraphLab, involves the design and implementation of a distributed programming abstraction that outperforms MapReduce and is designed to support iterative and potentially asynchronous algorithms on big data. His research targets the design and deployment of large-scale distributed algorithms, spanning both the theoretical and applied aspects of large-scale computing and applied machine learning.
Ted Dziuba was the co-founder and lead engineer behind Milo.com, an online comparison shopping engine. Milo was acquired by eBay in December 2010, and Ted is now Senior Member of Technical Staff for eBay’s Local division. Previously, he worked at Google on internal tools and at Pressflip, a machine learning startup. Today, he works with both SQL and NoSQL systems, from hardware and operational aspects to application development.
[click here for Eric’s slides]
Eric Bieschke runs playlist engineering for Pandora. As Pandora’s second employee, he built small-scale prototypes for many of Pandora’s systems and has grown them to serve more than 80M users who’ve thumbed 8 billion songs while listening to billions of hours of music. Pandora has taken a hybrid SQL/NoSQL approach to data scaling, with an architecture that leverages everything from Hadoop to Postgres to Redis.
About the Moderator
Mike works at SimpleGeo, a company that provides a hosted spatial database. His primary responsibility is obsessing over the scalable storage infrastructure built on top of Apache Cassandra. He spends his time making data structures work in a distributed, eventually consistent system, routing around failures, and making bad jokes about concurrency. Before SimpleGeo, Mike worked at Flickr, where data about some 6 billion photos is stored in one of the largest MySQL installations. SQL or not, he loves large storage architectures that can handle terabytes of data.