To save your data in RedShift you first need to download the pg module because in fact RedShift is a Postgres database.
Let's create our table in RedShift with the following sql command:
npm install pgIf you have not already done so, install the amazon sdk with the following command:
npm install aws-sdkWe can now write our first lines of code to import these modules into our file:
var AWS = require('aws-sdk'); const { Pool, Client } = require('pg');Creating a connection pool with the database:
let pool = new Pool({ user: user, host: 'your-endpoint.redshift.amazonaws.com', database: database-name, password: 'your-password', port: 5439, });You need to create a table in the RedShift public schema with a column for each property in your json file. Your file must be in RJSON (relaxed json) format, that is, without commas between your objects.
Let's create our table in RedShift with the following sql command:
CREATE TABLE public.product ( title varchar(256) NOT NULL, brand varchar(256) NOT NULL, productid int4 NOT NULL, salesamount float8 NOT NULL );We can use RedShift to copy data from a file in RJSON format directly from S3 into a RedShift table, like this:
let s3Path = 's3://bucket/file.json'; let targetTable = 'nome-da-tabela-no-redshift'; let region = 'aws-region'; let role = 'arn:aws:iam::ACCOUNTID:role/AWS-ROLE-WITH-ACCESS-TO-REDSHIFT'; let json_mode = 'auto';The COPY command is used to copy data from a file to a table. In our case, we are copying data from a file that is in a S3 bucket.
let copyStatement = [ 'COPY ' + targetTable, 'FROM ' + quote(s3Path), 'region ' + quote(region), 'iam_role ' + quote(role), 'json ' + quote(json_mode) ].join('\n');Running the query/command in the database:
pool.query(copyStatement, (err, res) => { if (err == undefined) { console.log('Data inserted successfully into database'); } else { console.log(err); } pool.end() }); function quote (x) { return "'" + x + "'" }You can download the file with the complete source code from GitHub.
Comments
Post a Comment