Boris Sagadin on September 27, 2016 • 6 minute read
To start off, here’s what Amazon says about Redshift:
Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse solution that makes it simple and cost-effective to efficiently analyze all your data using your existing business intelligence tools.
If you’d like to learn more about the tool itself, please read this. Amazon Redshift is based on a very popular open source database, PostgreSQL 8.0.2, and has a strong focus on scalability. A common setup is a clustered environment with a leader node. It follows an MPP (Massively Parallel Processing) architecture, which means that all operations are executed with as much parallelism as possible.
At the moment, MySQL, PostgreSQL, Microsoft Azure SQL and Amazon Redshift are supported out of the box on Databox. You have the data ready in your database; now it just needs to be visualized in an easy and concise manner so that everyone, even your boss, can use it.
Let’s get started!
What Will We Accomplish in This Tutorial?
First, we’ll set up a new AWS Redshift cluster from scratch, then we’ll connect it to Databox and confirm that the connection is working. Lastly, we will create a Datacard visualizing the data from the cluster. All this without a single line of code, except for the SQL query.
In this section, we will create a new AWS Redshift cluster step by step, add a user and set up network rules to allow access from our IP (
Log in to the AWS Console, visit Services / Redshift, click on ‘Launch Cluster’ and fill in the cluster details:
Fill the form as needed; defaults are fine in this example. Pick a secure password.
Now choose the Node Type; for this example, we’ll use the weakest one:
Single Node Cluster Type.
On the next page, your screen settings will also depend on your network setup; most of the defaults are fine for this example.
Review your settings and click on ‘Launch Cluster.’ Your cluster will take some time to build. When it’s ready, click on ‘Cluster Name’ and the cluster overview will be shown. The hostname to connect to is visible in the Endpoint string:
In our example, hostname is
Your server should now be set up to accept requests from our IP (22.214.171.124) to your Amazon Redshift cluster database, using your chosen user name and password. Go ahead and load some sample data, and the cluster is ready for connecting to Databox.
The database cluster is now ready! The next step is to connect it and test that it’s returning the data we need for our visualizations:
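If you have no data at hand, the sample-data step can be sketched as follows. This is only an illustration: the `users` and `posts` tables and their columns are assumptions inferred from the query used later in this tutorial, and the rows are made up. The script prints plain SQL statements that you can paste into any SQL client connected to your cluster; the local check runs them in an in-memory SQLite database, which is merely a stand-in and not Redshift.

```python
import sqlite3

# Hypothetical schema inferred from the tutorial's query; adjust to your data.
SCHEMA = [
    "CREATE TABLE users (ID INTEGER, display_name VARCHAR(100))",
    "CREATE TABLE posts (ID INTEGER, post_author INTEGER, post_type VARCHAR(20))",
]

# Made-up sample rows, for illustration only.
USERS = [(1, "Alice"), (2, "Bob")]
POSTS = [(10, 1, "post"), (11, 1, "post"), (12, 2, "post"), (13, 2, "page")]


def quote(value):
    """Render a Python value as a SQL literal (integers and strings only)."""
    if isinstance(value, int):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def insert_statements(table, rows):
    """Build one INSERT statement per row."""
    return [
        "INSERT INTO {} VALUES ({})".format(table, ", ".join(quote(v) for v in row))
        for row in rows
    ]


def sample_data_statements():
    """Return all statements: table definitions first, then the inserts."""
    return SCHEMA + insert_statements("users", USERS) + insert_statements("posts", POSTS)


def check_locally():
    """Run the statements in in-memory SQLite as a quick syntax check."""
    conn = sqlite3.connect(":memory:")
    for statement in sample_data_statements():
        conn.execute(statement)
    counts = (
        conn.execute("SELECT COUNT(*) FROM users").fetchone()[0],
        conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0],
    )
    conn.close()
    return counts


if __name__ == "__main__":
    # Print the statements so they can be pasted into a SQL client.
    for statement in sample_data_statements():
        print(statement + ";")
```

Generating plain statements keeps the sketch independent of any particular database driver; for larger data sets a bulk load would be the more usual route.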
Great! You have just successfully connected your database to Databox. In the next step, we’ll write a custom query that will regularly fetch data from your database and make it available for use in any Datacard.
Troubleshooting: If you get a “wrong credentials” message, double-check your user data. If you’re stuck on ‘Activate’ for a minute or so, it’s probably having issues connecting to your database host due to firewall / server / networking issues.
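For the networking case, a quick reachability check can help narrow things down. The sketch below only tests whether a TCP connection to the cluster opens from the machine you run it on, so a success does not prove that the Databox IP is allowed through your firewall rules. The function name `can_reach` is made up for this example, and port 5439 is assumed because it is the Redshift default; use your own port if you changed it.

```python
import socket

REDSHIFT_DEFAULT_PORT = 5439  # Redshift's default port; an assumption for your cluster


def can_reach(host, port=REDSHIFT_DEFAULT_PORT, timeout=5.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False
```

Call it with the hostname from your cluster’s Endpoint string, for example `can_reach("<your endpoint hostname>")`. If it returns `False` from a machine that should have access, the security group or network rules are a likely place to look.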
Now that the database is connected, we will use the Designer to query, shape and display the data in a format that’s most appropriate and useful for our needs:
SELECT COUNT(p.ID) AS posts, u.display_name FROM users u, posts p WHERE p.post_author=u.ID AND p.post_type='post' GROUP BY u.ID, u.display_name
We have just written a custom SQL query and displayed its results. Databox will fetch data from this resource every hour and store it in the selected target data source (in our example ‘My AWS Redshift’).
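To get a feel for the shape of the result before wiring it into a Datacard, a query of this kind can be dry-run locally. The sketch below is a rough illustration under several assumptions: the table contents are made up, SQLite stands in for Redshift, and the query is lightly adapted so that every non-aggregated column appears in GROUP BY (which PostgreSQL-family engines such as Redshift require) and an ORDER BY makes the local output deterministic.

```python
import sqlite3

# Adapted from the tutorial query; ORDER BY is added only for stable output.
QUERY = (
    "SELECT COUNT(p.ID) AS posts, u.display_name "
    "FROM users u, posts p "
    "WHERE p.post_author=u.ID AND p.post_type='post' "
    "GROUP BY u.ID, u.display_name ORDER BY u.ID"
)


def dry_run():
    """Run QUERY against made-up rows in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (ID INTEGER, display_name VARCHAR(100));
        CREATE TABLE posts (ID INTEGER, post_author INTEGER, post_type VARCHAR(20));
        INSERT INTO users VALUES (1, 'Alice');
        INSERT INTO users VALUES (2, 'Bob');
        INSERT INTO posts VALUES (10, 1, 'post');
        INSERT INTO posts VALUES (11, 1, 'post');
        INSERT INTO posts VALUES (12, 2, 'post');
        INSERT INTO posts VALUES (13, 2, 'page');
        """
    )
    cursor = conn.execute(QUERY)
    # Column names come from the query, so the AS alias shows up as a key.
    columns = [col[0] for col in cursor.description]
    rows = [dict(zip(columns, row)) for row in cursor.fetchall()]
    conn.close()
    return rows


if __name__ == "__main__":
    for row in dry_run():
        print(row)
```

Each row comes back as one count per author, with the `posts` alias as the name of the numeric column. The row with `post_type` set to `'page'` is filtered out by the WHERE clause.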
Note: we’ve used the AS SQL construct in our query (i.e. AS posts). That’s not mandatory, but it will describe the data as a metric key. If you use AS date, this column will then represent the date and time of the value (the ISO 8601 date and time standard is supported). You can leave it as it is, or rename the result column instead. A semicolon at the end of the query is not needed either.
Troubleshooting: If you don’t see any data, double-check your SQL query by running it directly on your database. If it’s not displaying results there, you have an error somewhere in your query. Also check that the AWS Redshift user has the necessary permissions to access the database from the Databox IP.
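As an illustration of the AS date idea, here is a hypothetical variation of the query that counts posts per day. The `post_date` column is an assumption and not part of the original example, the rows are made up, and SQLite is again used only as a local stand-in for Redshift. The helper `is_iso8601` relies on Python’s `datetime.fromisoformat`, which accepts common ISO 8601 forms but not necessarily every variant of the standard.

```python
import sqlite3
from datetime import datetime

# Hypothetical query: one value per day, with the day exposed as "date".
DAILY_POSTS_QUERY = (
    "SELECT COUNT(ID) AS posts, post_date AS date "
    "FROM posts WHERE post_type='post' "
    "GROUP BY post_date ORDER BY post_date"
)


def is_iso8601(value):
    """Return True if value parses as an ISO 8601 date or date-time string."""
    try:
        datetime.fromisoformat(value)
        return True
    except ValueError:
        return False


def run_locally():
    """Run the query against made-up rows in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE posts (ID INTEGER, post_type VARCHAR(20), post_date VARCHAR(10))"
    )
    conn.executemany(
        "INSERT INTO posts VALUES (?, ?, ?)",
        [
            (1, "post", "2016-09-26"),
            (2, "post", "2016-09-26"),
            (3, "post", "2016-09-27"),
            (4, "page", "2016-09-27"),
        ],
    )
    rows = conn.execute(DAILY_POSTS_QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for posts, date in run_locally():
        print(date, posts, is_iso8601(date))
```

Checking the date column locally like this is just a convenience; the point is that each returned row pairs a value with the date it belongs to.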
Well done! Your AWS Redshift database is now connected to Databox; queries can be executed and their results displayed on your mobile / big screen / computer.
Go ahead and explore further. Add more queries, add blocks, explore different types of visualizations. Make that perfect Datacard (or Datawall of course) you always needed but didn’t know how to get. Now you can! Clean and professional, right at your fingertips. Only data that matters, without clutter. The possibilities are truly endless.
Ready to try it for yourself? Sign up for free today and let us know how it went for you.
Remember: we’re always glad to help if you run into any obstacles!