This article was co-authored by How.com.vn staff writer, Katherine Pryszlak. Katherine Pryszlak has worked in the tech industry for decades, focusing on UI/UX, accessibility, and building systems that prioritize users from the ground up. With an academic background in English, she understands the importance of bringing technology to as wide an audience as possible with clear, concise communication, and loves working with the How.com.vn community to achieve that goal.
This article has been fact-checked, ensuring the accuracy of any cited facts and confirming the authority of its sources.
If you’re using Apache Hadoop, you know what a powerful tool it is for storing and processing massive amounts of data, but accessing all those petabytes can be difficult. Apache Hive lets you query, extract, and analyze your Hadoop data more easily using SQL-like commands. Creating a CSV file from a table is one way to pull the information you need into a usable format. However, exporting the file from Hive will often leave you without a header row because of differences in output formatting. This How.com.vn article will show you how to keep the column headers when you export to CSV using Hive and the Beeline command line interface.
Steps
- Update your software and server. If you haven’t updated in a while, you may be running a deprecated version of HiveServer. HiveServer2 has its own CLI (command line interface), called Beeline, which replaces the original Hive CLI and allows for more flexibility when accessing your data. You will also require:
- Java 1.7 or newer
- Hadoop 2.x
- Run HiveServer2. In your computer’s terminal, enter $HIVE_HOME/bin/hiveserver2.
- $HIVE_HOME is the directory in which Hive is stored.
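If you’d rather not tie up a terminal window, the following is a minimal sketch of starting HiveServer2 in the background instead. It assumes HIVE_HOME is already set. The helper names (hive_tool_available, start_hiveserver2) and the log path /tmp/hiveserver2.log are made up for this example.

```shell
# Returns 0 if $HIVE_HOME/bin/<tool> exists and is executable.
hive_tool_available() {
  [ -n "${HIVE_HOME:-}" ] && [ -x "$HIVE_HOME/bin/$1" ]
}

# Start HiveServer2 in the background and send its output to a log file.
# The default log path is a hypothetical choice for this sketch.
start_hiveserver2() {
  local log="${1:-/tmp/hiveserver2.log}"
  if ! hive_tool_available hiveserver2; then
    echo "hiveserver2 not found; is HIVE_HOME set?" >&2
    return 1
  fi
  nohup "$HIVE_HOME/bin/hiveserver2" > "$log" 2>&1 &
  echo "HiveServer2 starting (PID $!), log: $log"
}
```

Call start_hiveserver2 and give the server some time to come up before connecting, since it may not accept connections right away.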
- Run Beeline. In a separate terminal window, enter $HIVE_HOME/bin/beeline -u jdbc:hive2://LOCALHOST:PORT -n USERNAME -p PASSWORD.
- LOCALHOST is the hostname or IP address of the machine where HiveServer2 was started.
- PORT defaults to 10000.
- USERNAME and PASSWORD are the credentials you used when setting up Hive.
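To avoid retyping the connection string, the connection can be wrapped in a small script. This is only a sketch: the function names and the HS2_HOST, HS2_PORT, HIVE_USER, HIVE_PASSWORD, and BEELINE variables are hypothetical names chosen for this example, while -u, -n, and -p are Beeline's options for the JDBC URL, username, and password.

```shell
# Build the JDBC URL for HiveServer2 (defaults: localhost, port 10000).
jdbc_url() {
  local host="${1:-localhost}" port="${2:-10000}"
  printf 'jdbc:hive2://%s:%s' "$host" "$port"
}

# Open an interactive Beeline session.
# HS2_HOST, HS2_PORT, HIVE_USER and HIVE_PASSWORD are hypothetical variables;
# BEELINE optionally overrides the path to the beeline executable.
beeline_connect() {
  "${BEELINE:-$HIVE_HOME/bin/beeline}" \
    -u "$(jdbc_url "${HS2_HOST:-}" "${HS2_PORT:-}")" \
    -n "$HIVE_USER" -p "$HIVE_PASSWORD"
}
```

With HIVE_USER and HIVE_PASSWORD set, running beeline_connect should drop you at the Beeline prompt.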
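- Enter SHOW DATABASES; at the Beeline prompt. This will show you a list of your current databases. To list the tables in one of them, enter SHOW TABLES IN DATABASE; with DATABASE replaced by the database name. When you’re done, enter !quit to leave Beeline.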
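The same listing can be run without opening an interactive session by passing the statement to Beeline with -e. Here is a rough sketch, assuming HiveServer2 is reachable at the default localhost:10000. The run_hql, list_databases, and list_tables helpers and the JDBC_URL, HIVE_USER, HIVE_PASSWORD, and BEELINE variables are hypothetical names for this example.

```shell
# Default connection string; override JDBC_URL for a remote server.
JDBC_URL="${JDBC_URL:-jdbc:hive2://localhost:10000}"

# Run one HiveQL statement through Beeline and print the result.
run_hql() {
  "${BEELINE:-$HIVE_HOME/bin/beeline}" -u "$JDBC_URL" \
    -n "$HIVE_USER" -p "$HIVE_PASSWORD" \
    --silent=true -e "$1"
}

# List all databases, or all tables in the database named by $1.
list_databases() { run_hql "SHOW DATABASES;"; }
list_tables()    { run_hql "SHOW TABLES IN $1;"; }
```

For example, list_tables default should print the tables in the default database.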
- Export your file. At your shell prompt (not inside Beeline), enter the following command, replacing USERNAME and PASSWORD with your credentials and TABLENAME with the table you want to export. If the table is not in the default database, use DATABASE.TABLENAME. The command creates a file named export.csv in your current directory in CSV format, complete with headers!
$HIVE_HOME/bin/beeline -u jdbc:hive2://localhost:10000 -n USERNAME -p PASSWORD --outputformat=csv2 -e "SELECT * FROM TABLENAME" > export.csv
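With SELECT *, Beeline often writes each header as the table name, a dot, and the column name rather than the column name alone. The sketch below wraps the export and then strips that prefix from the first line only. export_csv and strip_header_prefix are hypothetical helper names, as are the JDBC_URL, HIVE_USER, HIVE_PASSWORD, and BEELINE variables, and the sed pattern assumes plain, unquoted column names made of letters, digits, and underscores.

```shell
# Default connection string; override JDBC_URL for a remote server.
JDBC_URL="${JDBC_URL:-jdbc:hive2://localhost:10000}"

# Run a query and write the result, header row included, to a CSV file.
export_csv() {
  local query="$1" outfile="$2"
  "${BEELINE:-$HIVE_HOME/bin/beeline}" -u "$JDBC_URL" \
    -n "$HIVE_USER" -p "$HIVE_PASSWORD" \
    --outputformat=csv2 --showHeader=true --silent=true \
    -e "$query" > "$outfile"
}

# Print the file with any "name." prefixes removed from the header line.
# Data rows (line 2 onward) are left untouched.
strip_header_prefix() {
  sed '1s/[A-Za-z0-9_]*\.//g' "$1"
}
```

A typical run would be export_csv "SELECT * FROM TABLENAME" raw.csv followed by strip_header_prefix raw.csv > export.csv. Hive also has a hive.resultset.use.unique.column.names setting that controls this prefix, but whether you can change it depends on how your server is configured.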
Tip
- The now-deprecated HiveServer1 used the Hive CLI to interact with Hadoop. With the release of HiveServer2 and Beeline, the Hive CLI is being phased out, and newer versions may not support Hive CLI commands.
⚠️ Disclaimer:
Content from Wiki How English language website. Text is available under the Creative Commons Attribution-Share Alike License; additional terms may apply.