Apache is known for its open source ideology and its dozens of incredible tools for developers and system administrators. HBase is one such tool, written in Java and built on top of the Hadoop project.
To learn HBase you’ll need fundamental skills in Java and some comfort with database models. You can learn all of this from small online guides along with more detailed books covering HBase features.
In this post I’ll share the 10 best HBase books for getting started and mastering HBase from a practical perspective. If you put in the work and pick up the right learning resources you can master HBase and easily fit this powerful database model into your workflow.
If you’re a complete beginner with no prior knowledge of HBase I would recommend starting with HBase: The Definitive Guide. It’s quite lengthy but full of great information. The author is even a core contributor to the HBase project.
You should have some knowledge of Java/Hadoop before even starting with HBase. But even if you have no experience with NoSQL you can still pick up this book and learn a real HBase workflow.
Software architect Nishant Garg authored HBase Essentials as an introductory guide to HBase for beginners. Garg has over a decade of experience building in Java and is one of the best instructors you could hope to find writing a book of this caliber.
You do not need any prior knowledge of HBase or of database systems like it. However it does help if you know a bit about Hadoop’s HDFS.
This is a very simple book to pick up and dive right into HBase, with exercises covering the basic setup and configuration of a new HBase database.
Garg explains the differences between HBase and other common database engines to give you an overview of how data is stored. The book also covers CRUD operations and more technical jobs like MapReduce.
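To give a rough feel for how HBase stores data, here is a minimal sketch in plain Java. It is a toy in-memory model, not the HBase client API, and every name in it (ToyTable, put, get, delete, scan) is hypothetical. It only illustrates the commonly described layout of a sorted map from row key to column to timestamped versions, along with the basic CRUD operations on top of it.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy in-memory model of an HBase-style table. NOT the real HBase client API;
// all names are hypothetical and exist only for illustration.
class ToyTable {
    // row key -> "family:qualifier" -> timestamp (newest first) -> value
    private final NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> rows =
            new TreeMap<>();

    // Create/update: every write adds a new timestamped version of the cell.
    void put(String row, String column, long timestamp, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .computeIfAbsent(column,
                c -> new TreeMap<Long, String>(Comparator.<Long>reverseOrder()))
            .put(timestamp, value);
    }

    // Read: return the newest version of a cell, or null if it does not exist.
    String get(String row, String column) {
        NavigableMap<String, NavigableMap<Long, String>> columns = rows.get(row);
        if (columns == null) return null;
        NavigableMap<Long, String> versions = columns.get(column);
        if (versions == null || versions.isEmpty()) return null;
        return versions.firstEntry().getValue();
    }

    // Delete: drop every version of a cell (a simplification of real deletes).
    void delete(String row, String column) {
        NavigableMap<String, NavigableMap<Long, String>> columns = rows.get(row);
        if (columns == null) return;
        columns.remove(column);
        if (columns.isEmpty()) rows.remove(row);
    }

    // Scan: rows stay sorted by key, so a range read is a simple sub-map.
    // Returns the row keys in [startRow, stopRow).
    List<String> scan(String startRow, String stopRow) {
        return new ArrayList<>(rows.subMap(startRow, true, stopRow, false).keySet());
    }
}
```

The point to take away is the sorted, versioned layout: reads ask for the newest version of a cell, and scans walk a contiguous range of row keys.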
This book feels like it gets very technical very quickly. It is made for beginners but it expects a lot of practice from anyone working through the exercises. Still, Garg has 10+ years of experience to share and the exercises are tailor-made for newcomers.
Large programming books can be intimidating, but HBase: The Definitive Guide is one of the largest and most useful books for newbies. The author Lars George has been working with HBase for almost 10 years. He knows this database system like the back of his hand, which quickly becomes obvious as you read his writing.
The book totals 550 pages full of exercises and theoretical examples of databases running HBase. Lars explains the differences between HBase and other non-relational database engines so you can understand why HBase is such a powerhouse.
Each chapter includes diagrams with code snippets to better explain each feature in the stack. But Lars doesn’t force you into a specific procedure with HBase. Instead he talks about the pros and cons for each task and compares the workflow to relational databases. You’ll learn about common pitfalls with HBase and how to avoid them in live projects.
Lars is one of the best-qualified instructors to write this book. It’s evident in the foreword and in every chapter.
This is not strictly an exercise book, but it works well as a combo of theory and practice for beginners. If you have the stomach to pick up a 550+ page book on HBase it can prove incredibly valuable regardless of your skill level.
If you’re looking for a practical guide to HBase for application development then HBase in Action is the perfect choice. The book covers everything from initial setup to the fundamentals of distributed systems. You’ll move through lessons quickly and each chapter builds upon the last.
HBase isn’t as complicated as it seems, but because it relies on many other technologies it can be off-putting. The NoSQL method is challenging but also incredibly rewarding for high-traffic applications. You can store billions of rows and still optimize for performance.
Early chapters teach the basics of HBase and why it’s so valuable for large scale applications. You’ll learn how to build real projects for user-powered databases and GIS tables, with each exercise offering practical code snippets along with HBase theory & best practices.
Over 360 pages you can go from a fairly inept HBase user to an advanced developer/sysadmin. However I don’t recommend this book to complete beginners since there’s so much to learn.
I do recommend that you have a little knowledge of HBase and Java before picking up this book, and ideally some knowledge of NoSQL too. But if you’re feeling bold you can go to town and hack your way through each lesson by learning as you go.
I actually find this book much friendlier to database admins and systems administrators who work with HBase on the server side. Yes it’s also a great book for developers. But many of the lessons teach HBase integration with Hadoop, Pig, and similar tools.
Learning HBase is still very much a beginner’s book spanning just over 300 pages. You’ll learn all about clusters and how to set up, configure, and debug HBase demo clusters. Each exercise builds on previous skills and the early chapters assume little-to-no prior experience.
The reason I say this book targets IT/technical users is that it focuses mostly on the HBase system rather than application development. You’ll learn how HBase stores data and how you can optimize retrieval of data. There’s plenty of Java to go around but many of these snippets aim to improve performance with HBase.
It’s still an excellent book for anyone who wants to get into the server side of an HBase ecosystem.
If you’ve read some of the O’Reilly books then you’ll know about their “definitive guide” series. Earlier in this post I mentioned HBase: The Definitive Guide which talks about the entire HBase system from start to finish.
Well their related book Hadoop: The Definitive Guide goes into the Hadoop framework, which is written in Java and serves as the foundation that HBase runs on. This Hadoop book totals 700+ pages, which is even larger than the HBase one. It is the best introduction for programmers who know Java and want to learn Hadoop.
You do not need to become an expert in Hadoop to run HBase. But it helps to learn about this framework and use it to its fullest potential. This book covers everything from scalable distributed systems to third-party applications for common projects.
The author Tom White is a voice worth reading in the realm of Hadoop. I find his writing style a bit technical so this book may be difficult to follow if you don’t have a solid programming background. But I still really enjoy the exercises and this book can take your HBase workflow to a whole new level.
Here’s another more advanced Hadoop book meant for developers willing to go the extra mile. Hadoop Application Architectures offers best practices and common sense problem solving for Hadoop environments.
Before picking up this book you will need to be comfortable building custom Hadoop projects on your own. This includes some knowledge of HBase and other related tools like Pig and Hive.
This book is a decent size with 400 pages and dozens of exercises covering different Hadoop features. You’ll find many HBase exercises that cover the NoSQL architecture and optimization techniques for key-value store applications.
More detailed HBase concepts get into column families and row-key naming schemes. The authors walk you through a variety of programs that teach you how to build on top of HBase with Hadoop architecture.
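If row-key schemes sound abstract, here is a small sketch in plain Java of two widely discussed ideas: prefixing a key with a salt bucket so sequential writes spread out, and reversing a timestamp so the newest entries sort first. This is not taken from the book and does not use the HBase API. The class name, method names, and the "|" separator are all assumptions made for illustration, and real HBase row keys are byte arrays rather than strings.

```java
import java.util.Locale;

// Hypothetical helper for illustrating row-key design. Strings are used for
// readability only; real HBase row keys are byte arrays.
class RowKeys {
    // Salting: prefix the key with a bucket number derived from its hash so
    // that keys which would otherwise sort together land in different ranges.
    static String salted(String naturalKey, int buckets) {
        int bucket = Math.floorMod(naturalKey.hashCode(), buckets);
        return String.format(Locale.ROOT, "%02d|%s", bucket, naturalKey);
    }

    // Reverse timestamp: subtract from Long.MAX_VALUE and zero-pad to 19
    // digits so newer events sort lexicographically before older ones.
    // Assumes epochMillis is non-negative.
    static String newestFirst(String entityId, long epochMillis) {
        long reversed = Long.MAX_VALUE - epochMillis;
        return String.format(Locale.ROOT, "%s|%019d", entityId, reversed);
    }
}
```

The trade-off worth knowing is that a salted key makes plain range scans harder, since one logical range is now split across every bucket.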
You do not need to master Hadoop for HBase development. However if you’re serious about a career in distributed DBs then it’s worth picking up this book once you’re beyond the basics in Hadoop programming.
Every application is only as good as its structure. Learning HBase is one thing, but organizing an application to scale is a whole different ballgame.
Architecting HBase Applications covers 250 pages worth of sample projects to guide you through best practices for HBase application development. You’ll learn how to design and deploy HBase apps with cluster deployment pitfalls & solutions. The authors even include case studies for HBase projects so you can learn from the experience of other professionals.
Each project covers a different concept from universal healthcare systems to digital advertising. Existing knowledge in Java is a must before picking up this book. However the early chapters offer a brief overview of HBase and the environment of NoSQL data storage.
I would recommend this only to devs & sysadmins who already have experience building on HBase. This book will specifically help programmers who want to build applications on HBase with best practices intact.
The book can seem a tad short, but what you get in this writing style & organization isn’t something you can find lying around on the Internet.
Database architects spend years practicing and studying the best methods for building scalable data stores. HBase Design Patterns teaches all the best database techniques using HBase/Hadoop as the teaching tool.
You should already have experience building on NoSQL before getting this book. It dives right into the action teaching about key generation and Hadoop clusters/configurations for DB clusters. You can apply many of these techniques to other related DBs like Cassandra. However I would only recommend this book for folks interested in HBase.
The book is very short at only 144 pages. It does cover HBase and general NoSQL design patterns in great detail. But I also expected a lot more for a book covering advanced HBase application development.
Still this book can be valuable for intermediate-to-advanced developers who need to master HBase data storage structures. The lessons can feel verbose and rather challenging at times. But if you keep practicing and work through the confusing areas you’ll come away from this book with a much deeper understanding of NoSQL optimization.
I know this isn’t a typical HBase book and it might not belong in this list. However the actual writing style and presentation is incredible, and I would highly recommend this to anyone aspiring towards a career in databases/server administration.
Seven Databases in Seven Weeks offers a crash course in seven popular database engines, most of them NoSQL. Over 350 pages you’ll learn about the different techniques and methods for modern data storage, with a heavy focus on NoSQL.
The authors cover CouchDB, MongoDB, HBase, Redis, Neo4J, Riak and PostgreSQL. This book compares all of these engines to show how they fit into a typical dev workflow and how you can organize your applications with the best tool(s) for the job.
I would recommend this to anyone getting into NoSQL who may want to follow a career in IT. HBase is featured prominently in the book, but you’ll also learn about similar options that can work well in certain dev stacks.
IT/tech enthusiasts and NoSQL fanatics should absolutely read this book.
Managing an entire Hadoop/HBase setup can be difficult if you’re new to clustering and large scale applications. But the HBase Administration Cookbook by Yifeng Jiang can make your job a whole lot easier.
This book is over 300 pages long offering dozens of unique recipes for solving common HBase problems. You’ll learn workarounds with Hive and the HBase shell to access features remotely. You’ll also learn how to configure distributed HBase clusters and how to implement load balancers over multiple databases.
Data replication, server monitoring, and visual reporting are all included in this cookbook. You get so many practical recipes that you’ll never want to put this down!
I highly recommend this book if you have experience working with HBase and Hadoop. You do not need to be an expert but you will need to know how to install & configure a basic HBase setup.
Since this book is so detailed it works well for intermediate & advanced sysadmins/developers who want a reference guide or toolkit full of HBase solutions. This is super dense and it’s one of the best advanced books covering HBase in practice.
Everyone starts from a different place with different skills and experience. HBase is a weird tool because it requires some Java experience but very little prior DB experience.
There are so many great books out there but these 10 are really the best to achieve confidence with HBase. Take another look over the list and if any titles catch your attention be sure to check them out.