If you’ve ever worked with Hadoop then you know about the many related tools & platforms. Hive is one such tool that lets you query and analyze data stored in Hadoop.
It seems like a complicated program but with the right learning materials it’s easy to pick up Hive from scratch.
I’ve organized the absolute best Hive books to take you from a complete novice to an expert user. There should be at least one book here for everyone regardless of your experience level.
But if you’ve never used Hadoop before you should pick that up first. Check out our best Hadoop books collection if you need some ideas to get started.
There aren’t many Hive books out there but my #1 pick has to be Programming Hive by O’Reilly. It’s perfect for complete beginners and even intermediate-level users. It also has an updated 2nd edition covering the latest version of Hive from start to finish.
It’s tough finding up-to-date books since not all publishers bother to release new editions. But with Apache Hive Essentials you know you’re getting the freshest information possible.
This book is pretty short with only 145 pages. Yet in those pages you’ll learn everything about the Hive workflow from beginning to end.
Each chapter covers some theory along with real-world practical examples. These walk through the basics of using Hive on top of big data stored in a Hadoop system.
The book’s index is easy to skim and the chapters are organized in a natural progression. A newbie could pick this up and follow along with ease. But the contents are also relevant for intermediate-level users.
Bottom line: this book will teach you Hive and teach it right.
O’Reilly released a powerful 2nd edition to their Programming Hive book and it’s the most compelling title yet. This is also the newest Hive book on the market so you know it’s up to date with the latest version.
Early chapters start by teaching you the basics of Hive, how to set up & configure a new install, and the basics of the Hadoop ecosystem. This includes info about HiveQL, which is Hive’s SQL-style query language.
Later chapters delve into the Hadoop workflow, explaining where Hive fits into the whole thing. You’ll also learn about MapReduce, which is a huge part of working with Hadoop.
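If MapReduce is new to you, here’s a rough idea of what it does. This is only a plain-Python sketch of the map, shuffle, and reduce steps behind a simple "count rows per group" query. It isn’t taken from the book and it isn’t Hive’s actual code; the function names and the `page` field are made up for illustration.

```python
from collections import defaultdict


def map_phase(rows):
    """Emit a (key, 1) pair for every input row."""
    for row in rows:
        yield row["page"], 1  # "page" is a hypothetical column name


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def count_by_page(rows):
    """Run the three steps in order, like a tiny single-machine job."""
    return reduce_phase(shuffle(map_phase(rows)))


if __name__ == "__main__":
    sample = [{"page": "home"}, {"page": "docs"}, {"page": "home"}]
    print(count_by_page(sample))
```

On a real cluster the map and reduce steps run in parallel across many machines, which is the part Hive handles for you when you write a query.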
Programming Hive is a much more professional book. It spans 400 pages and the second edition has been updated with all the newest features.
This is a solid book for beginners although maybe a bit technical. But it’s still the most detailed and up-to-date title covering Hive so it makes for a great intro book and a handy reference guide.
If you’re on a tight budget and just want a quick intro then check out Learn Hive in 1 Day. It’s only 79 pages long and it’s only available as a digital download.
But it’s also the most concise book on Hive that you’ll find anywhere. I actually think this book beats most online tutorials because it’s organized for beginners.
You’ll start by learning what Hive is and how it works on Hadoop. Then you’ll get into the basics of installing Hive and connecting to a database. Later chapters get into more detailed topics about searching and scaling your applications.
The amount of information packed into this little book is astounding.
And even though it’s a discount book I still think the quality of writing and depth of content are superb.
Apache Hive is best known as a data warehouse system for querying and managing data stored in Hadoop. With Apache Hive Essentials How-to you’ll dive right into this system along with the HiveQL language.
Each chapter covers a variety of common operations along with step-by-step tutorials. You’ll learn actionable tips for developing on Hive, looking into different storage formats, and digging into more advanced features like UDFs.
However this guide doesn’t cover everything and it skips over a lot of the Hive administration topics. The book is only 76 pages so it’s not meant to be a complete guide.
Still, it offers a reasonable starting point for new users who want to get up & running with Hive as quickly as possible.
Practical Hive is another newly published book covering the basics of Hive for Hadoop environments. You’ll learn how to install and configure Hive for your datasets along with the basics of querying with HiveQL.
But the goal of this book is practicality with live examples and case studies.
Each chapter has a series of exercises including the solutions with plenty of screenshots. You do not need any prior Hive experience but you should understand SQL and some basics of database management.
Practical Hive covers all the basics along with more complex functionality like MapReduce and DDL/DML operations. Later chapters even go over some workflow tips to improve server performance and to optimize your Hive setup.
This is yet another incredible book aimed towards beginner-to-intermediate users who want to master the basics of Hive through practice work.
There aren’t too many situations that you wouldn’t be able to solve with some Googling. But the Apache Hive Cookbook will save you a lot of time and headaches trying to scale your applications on Hive.
The book spans 268 pages with dozens of recipes for common and not-so-common Hive scenarios.
You’ll learn about the Hive Data Model, partitions, debugging, security, and working with other frameworks like Apache Spark.
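Partitions are easier to grasp with a quick example. Hive stores each partition of a table in its own `key=value` directory, so a query that filters on the partition column can skip whole folders. Here’s a minimal Python sketch of that idea. It isn’t from the book, and the table path, column names, and helper functions are all hypothetical.

```python
from pathlib import PurePosixPath


def partition_path(table_root, partition):
    """Build a Hive-style partition directory: root/key1=val1/key2=val2."""
    path = PurePosixPath(table_root)
    for key, value in partition.items():
        path = path / f"{key}={value}"
    return str(path)


def prune(partitions, **wanted):
    """Keep only the partitions whose values match every filter."""
    return [
        p for p in partitions
        if all(p.get(key) == value for key, value in wanted.items())
    ]


if __name__ == "__main__":
    # Hypothetical partitions of a "logs" table, keyed by year and month.
    parts = [
        {"year": "2015", "month": "12"},
        {"year": "2016", "month": "03"},
        {"year": "2016", "month": "04"},
    ]
    # Only the 2016 directories would need to be read for this filter.
    for p in prune(parts, year="2016"):
        print(partition_path("/warehouse/logs", p))
```

The point is that filtering happens on directory names before any data is read, which is why a sensible partition scheme matters so much on large tables.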
These recipes are simple enough that even a beginner could follow along. But it helps to have some experience with Hive and Hadoop since you’ll work through the recipes a lot more easily.
If you’re a complete beginner struggling to learn Hive then give Programming Hive a shot. It’s a bit lengthy but it also goes into great detail on every aspect of Hive: setup, maintenance, security and customization.
If you want something cheaper and shorter I’d say Learn Hive in 1 Day is the next best option. It’ll only cover the basics, but sometimes that’s all you need.
Once you’re building with Hive and need some practical big data solutions, the Apache Hive Cookbook will become your favorite resource.
Hive isn’t as confusing as it seems but getting started can be tough. Thankfully any book from this list can help you get up & moving with Hive and Hadoop.