Tuesday, March 6, 2018

How to add .csv file data to MySQL through Java

First what is CSV meaning? 


It is abbreviation for Comma Separated Values. 


What is it?


CSV is a file format which allows you to save data as a table structured format.  This file format actually looks like a text file.


Why do we need to use a .csv file?


When you want to store data as a table structured format easily. This format is used in many technologies and for many products. You can learn more about in this blog post:  CSV - What, Why and How.  And heavily used to migrate data from one place to another. As a example in programming world, data migrate from MS SQLserver 2008 to another server. Of course we have better alternatives for this, XML(Extensible Markup Language) and JSON(Javascript Object Notation) are better examples for this. This format is somewhat not a better idea to use with complex data or with hard specification(This is defer from user to user. For me it is but for some people they are using some mechanism to handle this situation). Read more about this from softwareengineering.stackexchange.com post.


We can create and add data through Microsoft Excel, Google Sheets and many more. But this post not about that but about how to add .csv file data to MySQL through Java program,


Here are the steps:


1. Create a database, table inside it and add fields that we want.

As a example I am going to create a database called Songs and table called Rap. This table contains four fields. Those are id, song_name, artist, relesead_year. MySQL Query as follows:

For database,

CREATE DATABASE Songs

To create table,

CREATE TABLE Rap(
  id INT,
  song_name VARCHAR(150),
  artist VARCHAR(60),
  relesead_year DATE
)

I did not add primary key and other things for the sake of simplicity of this post.

2. Create Java project with main class(I am using Eclipse IDE for this but you can use any IDE or notepad. steps are same)

If you are using eclipse go to File -> New -> Java Project. In the pop up window you can find a place to give a name to project. I have named it as AudioLibrary.

Next, right click on the project and go to New -> Class. In this window add the class name as Songs(I know this is not a proper name for this class. You can add a proper name) and package name as com.slcoder.audiolibrary and also make sure to tick the main class checkbox. As shown in the below image,


3. Connect to MySQL database.

Now, we must connect our Java project with MySQL database, otherwise we can not do CRUD(Create, Update and Delete). To connect we need a connector, a MySQL connector JAR(Java Archive). To download go to dev.mysql.com.

Next, right click on the project and create a folder called libs and copy paste the jar file that you have downloaded. Then, expand the libs folder and right click on the jar file and go to Build Path -> Add to buid path.


To ensure that this jar file successfully added to build path, expand and have a look at the Referenced Libraries. JAR file should link to that folder.

If you did not add the jar file to build path, you will get a error message like,

No suitable driver found for jdbc:mysql://localhost/Songs

Now we can add java code to the Songs class main method to connect Java with MySQL database. Firstly, we need to 

Connection con = DriverManager.getConnection("jdbc:mysql://localhost/Songs", "root", "1234");

Keep remember to add the correct package for Connection because there are two packages, one is com.sql and other one is com.jdbc both are work fine with this example but it is a good practice to use com.sql. The reason is when you need to use other database driver, you do not have to change your java code(Portability).

By the way, what is the database driver?

JDBC drivers are software libraries(JAR files) that communicate between a Java application and a database. For each and every relational database there are a specific JDBC driver.

Connection is a class actualy it is a interface. con is the object name. getConnection() is a method in the DriverManager class. And getConnection() takes three arguments.

  - JDBC connection URL
   
      It has a format and that format should be like above. Inside the "jdbc:mysql://" is same for every driver. localhost is the DNS of the host server, you can also add IP address(Usually 127.0.0.1) instead of localhost.  They are JDBC(Java Database Connectivity) connection URL, Database user's username, and the password. The last part of the connection url or the first argument is the database name /Songs

  - username
      Second arguement is the username of the user in your database. The default name is root if you have not changed it.
  - Password
      And the third argument is the password of that user. By default there is no password for root user. Of course you can change it or add another user.

What is JDBC(Java Database Connectivity)?

It is a API(Application Programmable Interface) for connecting to Relational Database.


DriverManager

The DriverManager provides a basic service for managing a set of JDBC drivers. Read more about this class in the doc.


getConnection()


When this method get called DriverManager will attempt to locate a suitable driver. And the responsible of this method is to establish the connection to given url.

4. Create a .csv file for our needs and add some data to it.(If you want to learn how to add data to .csv file through MySQL using java)

Right click on the project or inside the src directory of your project and create a file and give it a name as songsdata.csv. and add some line to that file. When you do that kee remember to separate fields by , and these fields should match with the table fields/columns. As well as number of columns and data type of column.

I am gonna add three lines according to our database table columns as follows,

1,Mocking Bird,Eminem,2004-00-00
2,Dear Mama,2Pac,1995-00-00
3,My Life,The Game,2008-00-00

If there is two many comma separated values, when running your application you will end up with the

Row 1 was truncated; it contained more data than there were input columns 

If there is lesser values

Row 1 doesn't contain data for all columns

If data types does not match with the table values or if your .csv file is something like- 'a', 'b', '', 'd' MySQL does not know what to do with the '' value. So, you will get a error message something like this.

Data truncation: Incorrect date value: 'someValue' for column 'aColumn' at row 1

In case if you have not value to add for a field, you can use NULL like 'a', 'b', 'NULL', 'd'.

5. Write mysql query to add .csv file data into mysql table.

String sql =  "load data local infile 'songsdata.csv' \n" +
           "   replace \n" +
       "   into table Rap \n" +

       "   columns terminated by ',' ";

String sql is the variable name and the value, that variable holds is MySQL query.

 "load data local infile  is used to load data from a local file. that file is is the 'songsdata.csv'  wrapped by single comma and \n use to add new line

+ sign is added because I have devided the String into four lines to take readability in my example. Of course you do like below. 

String sql =  "load data local infile 'songsdata.csv' \n replace \n into table Rap columns terminated by ',' ";

replace \n will remove all the new lines from .csv file. If you refer the file, you can see we have added as three lines. So from sql query we are remove that line separation.

into table Rap columns terminated by ',' storing into the our table Rap and we have to specify what is the delimiter we are using to seperate fields. In this example it is ',' and you can use pipe-delimiter '|' or tab '\t'. 


1|Mocking Bird|Eminem|2004-00-00

or

1 Mocking Bird Eminem 2004-00-00

Above line should use tab key not space. But if you do not change the csv comma separator to what ever delimiter you are using, you will get a error message like

Data truncated for column 'id' at row 1

Now you need two more line. One is

Statement stmt = con.createStatement();

Statement is also a interface and it's responsibility to executing a static SQL statement from the object that provides(in this example con) and returning the results(sql statement that created) it produces to the stmt. Read the doc.

Second one is,

stmt.execute(sql);

Above line use to execute the query. Statement contains a method called execute that takes one argument. Argument is the MySQL query. We can direcly add as a String or as I shown in the example assign into a variable and pass it to execute(). And returns true if the first object that the query returns is a ResultSet object.


6. Run the project and explore the database table.





What are the best practices?
  1. Of course we did not close the database connection.
  2. We can use quotes in our .csv file to avoid unexpected errors or data.
What are the other things, feel free to suggest me. Thank you for reading and I hope you learned something.

No comments:

Post a Comment