So… you suppose all of your knowledge in a specific discipline are a string kind, however if you attempt to run your question, you get some errors. Doing extra investigation, it seems like you might have some int and undefined sorts as nicely. Bummer…
Despair not! We will really work round this (with out knowledge prep 😉). To recap, in our first weblog, we created an integration with MongoDB on Rockset, so Rockset can learn and [update] the information coming in MongoDB. As soon as the information is in Rockset, we will run SQL on schemaless and unstructured knowledge.
Embedded content material: https://gist.github.com/nfarah86/ef1cc9da88e56226c4c46fd0e3c8e16e
We have an interest within the
Rockset has a perform known as DATE_PARSE(), which lets you flip a string formatted date right into a date object. If you wish to order motion pictures by simply the 12 months, you need to use EXTRACT().
Basically, when you flip your string formatted date right into a date object, you’ll be able to then extract the 12 months.
At first look, this appears fairly straightforward to resolve— if you happen to needed to order all of the film titles by the discharge 12 months, you’ll be able to write one thing like this:
SELECT t.title, t.release_date FROM commons.TwtichMovies t ORDER BY EXTRACT( YEAR FROM DATE_PARSE(t.release_date, '%Y-%m-%d') ) DESC ;
When working this question, we get a timestamp parsing error:
Timestamp parse error:
This might imply you’re working with different knowledge sorts that aren’t strings. To examine, you’ll be able to write one thing like this:
SELECT t.title, TYPEOF(t.release_date) FROM commons.TwtichMovies t WHERE TYPEOF(t.release_date) != 'string' ;
That is what we get again:
Now, that we all know what’s inflicting the error, we will re-write the question to discard something that’s not a string kind— proper 🤗?
SELECT t.title, t.release_date FROM commons.TwtichMovies t WHERE TYPEOF(t.release_date) = 'string'; ORDER BY EXTRACT( YEAR FROM DATE_PARSE(t.release_date, '%Y-%m-%d') )DESC ;
WRONG 🥺! This really returns a timestamp parsing error as nicely:
Timestamp parse error
You are most likely saying to your self, “what the heck.” One case we didn’t take into accounts earlier is that there could possibly be empty strings 🤯- If we run the next question:
SELECT DATE_PARSE('', '%Y-%m-%d');
We get the identical timestamp parsing error again:
Timestamp parse error
How can we really write this question to keep away from the timestamp parsing errors? Right here, we will really verify the LENGTH() of the string and filter out the whole lot that doesn’t meet the size requirement— so one thing like this:
WHERE LENGTH(t.release_date) = 10
We will additionally TRY_CAST()
t.release_date to a string. If the sphere worth can’t be become a string, a
null worth is returned (i.e. it gained’t error out). Placing this all collectively, we will technically write one thing like this:
SELECT t.title, t.release_date FROM commons.TwtichMovies t WHERE TRY_CAST(t.release_date AS string) shouldn't be null AND LENGTH(TRY_CAST(t.release_date AS string)) = 10 ORDER BY EXTRACT( YEAR FROM DATE_PARSE(t.release_date, '%Y-%m-%d') ) ;
Voila! it really works!
In the course of the stream, I really wrote a extra sophisticated model of this question. The above question and the question within the stream are equal. We additionally wrote queries that mixture! You’ll be able to catch the total breakdown of the session beneath:
Embedded content material: https://youtu.be/PGpEsg7Qw7A
TLDR: yow will discover all of the sources it is advisable to get began on Rockset within the developer nook.