SQL Notes (Oracle Flavor)
These are my personal notes that I use as a quick help in my work.
Relational databases have four properties that guarantee reliable processing of transactions: atomicity, consistency, isolation, and durability (ACID, see wikipedia).
Try doing MINUS/EXCEPT as a select of rows that don't exist in the optional table of an outer join. "where not..." is also possible but probably much less efficient.
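For example, a sketch of the anti-join form (t1, t2, and the id column are placeholders; this compares on the join key only, unlike MINUS, which compares whole rows):
SELECT t1.*
FROM t1 LEFT OUTER JOIN t2 ON (t1.id = t2.id)
WHERE t2.id IS NULL;
-- the "where not" alternative, probably slower:
SELECT t1.* FROM t1 WHERE t1.id NOT IN (SELECT id FROM t2);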
Database-specific commands for granting and revoking privileges, and other administrative tasks.
RENAME table_name TO new_name; -- Rename a table, but cannot rename a column
DQL (Data Query Language) is basically the querying part of DML.
Select
SELECT t1.col1 as a1
, t2.col2 as a2
FROM table1 t1
, table2 t2
WHERE t1.id = t2.id
UNION [ALL] / MINUS / INTERSECT
SELECT ...
ORDER BY 1, 2;
SELECT col1
, SUM ( col2 )
FROM table1 t1
WHERE ...
GROUP BY col1
HAVING COUNT(*)>1 ;
SELECT * FROM table1 t1 CROSS JOIN table2 t2 ;
- Cartesian product (ANSI SQL:1999 syntax)
SELECT * FROM table1 NATURAL JOIN table2;
- Natural join with columns of same name and type (N.B. no aliases allowed)
SELECT * FROM table1 t1 INNER JOIN table2 t2 ON (t1.col=t2.col);
SELECT * FROM table1 t1, table2 t2 WHERE t1.col=t2.col;
- Inner join
SELECT * FROM table1 t1 LEFT OUTER JOIN table2 t2 ON (t1.col=t2.col);
SELECT * FROM table1 t1, table2 t2 WHERE t1.col=t2.col(+);
- Left outer join (table2 is optional); the (+) goes on the side of the optional data
SELECT * FROM table1 t1 RIGHT OUTER JOIN table2 t2 ON (t1.col=t2.col);
SELECT * FROM table1 t1, table2 t2 WHERE t1.col(+)=t2.col;
- Right outer join (table1 is optional); the (+) goes on the side of the optional data
SELECT * FROM (t1 RIGHT OUTER JOIN t2 ON (...)) INNER JOIN t3 ON (...);
- Combine joins
SELECT * FROM table1 t1 FULL OUTER JOIN table2 t2 ON (t1.col=t2.col);
- Full outer join (union all)
SELECT * FROM the_table WHERE (a, b, c) IN (SELECT a, b, c FROM another_table);
- Multiple fields for IN operator
Outer joins: the (+) is on the optional side. See traps below.
Insert
INSERT INTO table_name [(column1, column2)]
SELECT c1, c2
FROM ...;
INSERT INTO table_name [(column1, column2)]
VALUES ( a, b ) ;
Update
UPDATE table_name
SET GLU = A
, GLA = B
WHERE cond;
UPDATE table_name
SET GLU = NULL -- set to null
WHERE cond;
UPDATE table_a a
SET col1 = (SELECT col1
            FROM table_c
            WHERE ...)
  , (col2, col3) = (SELECT expr2, expr3
                    FROM table_b b
                    WHERE a.col4 = b.col4)
WHERE ... ;
The subquery must return no more than one row for each row updated. If the subquery returns no rows, then the column is assigned a null. If needed, put "rownum = 1" in the sub-query.
Note that the "UPDATE ... FROM ..." syntax is not available in Oracle. Use the "(col, col) = (SELECT expr, expr FROM ...)" syntax, or use PL/SQL (see cursors).
Access:
UPDATE
A INNER JOIN B ON A.ID = B.ID
SET A.x = B.y;
Delete
DELETE FROM table
WHERE cond;
DELETE FROM table WHERE (a, b, c) IN (SELECT a, b, c FROM table MINUS SELECT a, b, c FROM table2);
Aggregate functions
SUM
COUNT, COUNT(ALL col)
COUNT(*) -- Counts the number of rows
COUNT(col) -- Counts the number of rows with a value for "col". Basically, equal to count(*) minus the number of rows with nulls for "col"
COUNT(DISTINCT col) -- Number of distinct non-null values
MAX, MIN
AVG, STDDEV, VARIANCE
create table abc (a number, b number);
insert into abc values(1,1);
insert into abc values(2,2);
insert into abc values(3,3);
insert into abc values(4,null);
insert into abc values(5,1);
insert into abc values(5,2);
insert into abc values(6,1);
insert into abc values(6,2);
insert into abc values(6,3);
insert into abc values(7,1);
insert into abc values(7,2);
insert into abc values(7,null);
insert into abc values(8,1);
insert into abc values(8,2);
insert into abc values(8,2);
commit;
select a, count(*), count(b), count(distinct b) from abc group by a order by a;
Hierarchical Queries (CONNECT BY)
SELECT columns
, LPAD ( '/', 2*LEVEL, '-') || col
, LEVEL
, SYS_CONNECT_BY_PATH(col,'/') -- shows path to root, only 9i
FROM table
WHERE cond
START WITH condition_of_start_records
(such as: col2_pointing_to_col1 is null)
CONNECT BY PRIOR col1 = col2_pointing_to_col1
AND PRIOR col3 = col4_pointing_to_col3
AND another_condition
ORDER SIBLINGS BY col ; -- SIBLINGS:9i
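A concrete sketch, assuming the classic EMP table (columns EMPNO, MGR, ENAME), with the top managers as the roots:
SELECT LPAD(' ', 2*LEVEL) || ename AS org_chart
     , LEVEL
     , SYS_CONNECT_BY_PATH(ename, '/') AS path_from_root
FROM emp
START WITH mgr IS NULL
CONNECT BY PRIOR empno = mgr
ORDER SIBLINGS BY ename;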
Trick to create a comma-separated list:
SELECT LTRIM(SYS_CONNECT_BY_PATH (p,','),',')
FROM (SELECT ROWNUM r, the_column as p FROM a_table)
WHERE CONNECT_BY_ISLEAF = 1
START WITH r = 1
CONNECT BY r = PRIOR r + 1
ORDER BY p;
The LTRIM removes the first comma
WITH Clause
Also known as Common Table Expressions (CTE).
WITH w AS (SELECT ...)
   , w2 AS (SELECT ...)
SELECT ...
FROM w, w2
;
SYSDATE, SYSTIMESTAMP: Returns date and timestamp
UID, USER
CURRVAL, NEXTVAL: Use with sequences
LEVEL: Use with SELECT CONNECT BY
ROWID: Returns the rowid or binary address
ROWNUM: Row number before sort
ORA_ROWSCN: Returns system change number
SHOW USER
SELECT USER FROM DUAL;
- Display current user
CASE
  WHEN cond1 THEN expr1
  WHEN cond2 THEN expr2
  ELSE expr3
END
- Case function in queries but not in PL/SQL. Remember the "END"; no expression after the "CASE".
SELECT * FROM global_name;
- Display current database
New in 9i
WIDTH_BUCKET(field, min-value, max-value, n)
- Creates n+2 buckets: 1 for values less than min-value, 1 for values more than max-value, and n equally sized buckets between min-value and max-value.
COALESCE(arg1, arg2, arg3)
- Returns the first non-null value
NULLIF(expr1, expr2)
- Returns null if the two expressions are equal, otherwise returns the first expression
WITH query_name AS (subquery) [, query_name AS (subquery)]...
- Sub-query factoring (see "WITH Clause" below)
CASE WHEN condition THEN value WHEN condition2 THEN value2 [ELSE value] END
- CASE expression in the select clause
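A hedged sketch of WIDTH_BUCKET, COALESCE, and NULLIF together (the employees table and its columns are made up):
select width_bucket(salary, 0, 10000, 10) as salary_bucket  -- which of the 10 equal buckets between 0 and 10000 the salary falls in
     , coalesce(commission, bonus, 0) as pay_extra          -- first non-null of the three arguments
     , nullif(department_id, 0) as dept_or_null             -- null when department_id = 0, otherwise department_id
from employees;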
Variables (look at this some more)
(PL/SQL in 21 days, p. 28)
Bind variables: VARIABLE var_name type
with type:
- NUMBER
- CHAR[(length)]
- NCHAR[(length)]
- VARCHAR2[(length)]
- NVARCHAR2[(length)]
- CLOB
- NCLOB
- REFCURSOR
Reference with: :var_name := 0
print var_name
VARIABLE --> returns all variables
str1 [NOT] LIKE str2 [ESCAPE 'c']
str2 is the pattern. An underscore "_" matches exactly one character;
a percent "%" matches zero or more characters, but does not match a null.
Note: if the first character is not "%" or "_", then the index can be used.
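A small hedged sketch (the customers table and its columns are made up): "_" matches exactly one character, "%" matches any run of characters, and ESCAPE lets you search for a literal "%".
SELECT * FROM customers WHERE last_name LIKE 'Sm_th%';        -- Smith, Smyth, Smithson, ...
SELECT * FROM customers WHERE remark LIKE '%\%%' ESCAPE '\';  -- remarks containing a literal percent sign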
Also called sub-query factoring. Useful when a subquery is executed multiple times. Note: write "WITH" only once; chain additional subqueries with commas.
WITH a_subquery_name AS (...put query here...) -- need ( )
, second_subquery_name AS (...)
SELECT -- a query using a_subquery_name multiple times
FROM a_subquery_name, second_subquery_name ...;
The /*+ MATERIALIZE */ hint tells the optimizer to create a temporary table, whereas the /*+ INLINE */ hint tells it to process the subquery inline.
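A minimal sketch of placing the hint (table and column names are placeholders); the hint goes inside the factored subquery's SELECT:
WITH big_sub AS (SELECT /*+ MATERIALIZE */ col1, COUNT(*) AS ct FROM a_table GROUP BY col1)
SELECT b1.col1, b1.ct, b2.ct
FROM big_sub b1 JOIN big_sub b2 ON (b1.col1 = b2.col1);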
Note: Oracle has three ways of materializing data:
Temporary tables:
create global temporary table
a_temp_table (field definitions)
on commit [preserve / delete] rows;
First few rows:
select * from A fetch first 5 rows only;  -- Oracle 12c and ANSI SQL
select * from A where rownum <= 5;        -- Oracle (before 12c)
select * from A limit 5;                  -- MySQL, PostgreSQL, SQLite
select top 5 * from A;                    -- SQL Server
Also known as "analytic functions."
These are performed as the last step in the query before the "order by."
The notes below are for Oracle. See the Oracle User Group's page and wikipedia.
AVG(DISTINCT | ALL <expression>) OVER (<analytic clause>)
CORR(<expression1>, <expression2>) OVER (<analytic clause>)
COUNT(<* | [DISTINCT | ALL] expression>) OVER (<analytic clause>)
COVAR_POP(<expression1>, <expression2>) OVER (<analytic clause>)
COVAR_SAMP(<expression1>, <expression2>) OVER (<analytic clause>)
CUME_DIST(<value>) OVER (<partition_clause> <order by clause>)
DENSE_RANK() OVER (<query_partition_clause> <order_by_clause>)
FIRST_VALUE(<expression> [IGNORE NULLS]) OVER (<analytic clause>)
LAG(<value expression>, <offset>, <default>) OVER ([<query partition clause>] <order_by_clause>)
<aggregate function> KEEP (DENSE_RANK LAST ORDER BY <expression> <ASC | DESC> NULLS <FIRST | LAST>)
LAST_VALUE(<expression> IGNORE NULLS) OVER (<analytic clause>)
LEAD(<expression, offset, default>) [(<query_partition_clause>)] OVER (<order_by_clause>)
MAX(<DISTINCT | ALL> expression) OVER (<analytic clause>)
MIN(<DISTINCT | ALL> expression) OVER (<analytic clause>)
NTILE(<expression>) OVER ([query_partition_clause] <order by clause>)
PERCENT_RANK(<value>) OVER (<partition_clause> <order_by_clause>)
PERCENTILE_CONT(<value>) WITHIN GROUP (ORDER BY <expression> [ASC | DESC]) OVER (<partition_clause>)
PERCENTILE_DISC(<expression>) WITHIN GROUP (ORDER BY <order_by_clause>)
RANK(<value>) OVER (<partition_clause> ORDER BY <order_by_clause>)
RATIO_TO_REPORT(<value>) OVER (<partition_clause>)
ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...)
STDDEV([DISTINCT | ALL] <expression>) OVER (<analytic_clause>)
STDDEV_POP(<expression>) OVER (<analytic_clause>)
STDDEV_SAMP(<expression>) OVER (<analytic_clause>)
SUM(<expression>) OVER (<analytic_clause>)
VAR_POP(<value>) OVER (<analytic_clause>)
VAR_SAMP(<value>) OVER (<analytic_clause>)
VARIANCE([DISTINCT | ALL] <value>) OVER (<analytic_clause>)
Linear regression functions, each as FUNCTION_NAME(<expression1>, <expression2>) OVER (<analytic_clause>): REGR_AVGX, REGR_AVGY, REGR_COUNT, REGR_INTERCEPT, REGR_R2, REGR_SLOPE, REGR_SXX, REGR_SXY, REGR_SYY
SELECT <aggregate_function(column_name)> KEEP (DENSE_RANK FIRST ORDER BY <column_name> [<ASC|DESC> NULLS <FIRST|LAST>]) OVER (PARTITION BY <column_name>) FROM <table_name> GROUP BY <column_name>;
Ignore the nulls with: (<column_name> IGNORE NULLS)
Compare counting method:
This counts records in the table:
select ..., (select count(*) from A) as total, ... from A where ...;
whereas this counts the total in the result set defined by the where clause:
select ..., count(*) over () as total, ... from A where ...;
Get the row for the latest value of a column:
select *
from (select ...
, max(a_column) over (partition by b_column, c_column) as max_val
from a_table
)
where a_column = max_val;
Find duplicate rows:
select * from (
select xxx
, count(*) over (partition by columns...) as ct
from the_table
)
where ct > 1;
Ranking, etc
select
rank() over(order by value_count desc), --Ranking, ties get the same value, see table below
row_number() over (order by value_count desc), -- Sequential number, increases for each row, see table below
(sum(value_count) over (order by value_count desc, rownum))/num_rows -- Cumulative percentage, or use cume_dist
from a_table
order by value_count desc
| A | row_number() over (order by A) | rank() over (order by A) | dense_rank() over (order by A) |
|---|---|---|---|
| 1 | 1 | 1 | 1 |
| 1 | 2 | 1 | 1 |
| 1 | 3 | 1 | 1 |
| 2 | 4 | 4 | 2 |
| 2 | 5 | 4 | 2 |
| 3 | 6 | 6 | 3 |
Lead / lag
select lead(a_column [, offset [, default]]) over (partition by ... order by ...) as next_a_column
, a_column
from ...;
Same syntax for lag
| lead(A, 2, -1) over (order by A) | lead(A, 1, -1) over (order by A) | A | lag(A) over (order by A) | lag(A, 2) over (order by A) |
|---|---|---|---|---|
| 1 | 1 | 1 | <null> (no default) | <null> (no default) |
| 2 | 1 | 1 | 1 | <null> (no default) |
| 2 | 2 | 1 | 1 | 1 |
| 3 | 2 | 2 | 1 | 1 |
| -1 (default) | 3 | 2 | 2 | 1 |
| -1 (default) | -1 (default) | 3 | 2 | 2 |
List
Create a list with separators
select ..., listagg(col, ',') within group (order by ...) over (partition by ...) from a_table;
select ..., listagg(col, ',') within group (order by ...) from a_table group by ...;
Read:
http://www.oracle.com/technology/products/bi/db/10g/dbbi_tech_info_anl.html
http://www.ss64.com/orasyntax/an.html
http://www.oracle-base.com/articles/misc/RankDenseRankFirstLastAnalyticFunctions.php
Boolean logic with nulls (tested on both Oracle and SQL Server):
true OR whatever = true
false AND whatever = false
"x IS NULL OR condition based on x
" makes sense (and "x IS NOT NULL OR condition based on x
" does not)
"x IS NOT NULL AND condition based on x
" makes sense (and "x IS NULL AND condition based on x
" does not)
Testing whether a condition involving nulls evaluates to null, true, or false. The "null_column" is a column that has some nulls (or all nulls).
SELECT ... FROM ...
WHERE null_column IS NULL
  AND [NOT] ( null_column > 0                      -- this gives a null
              AND | OR
              another_column = another_column      -- this gives a true
              another_column <> another_column     -- this gives a false
            );
If the query shows no results, put a NOT. If there are still no results, then it is a null; otherwise the condition is true.
Some thoughts
select x
, sum(d2) as d2
, sum(d1) as d1
from (
select x
, case d when trunc(sysdate-1, 'DD') then ct else null end d1
, case d when trunc(sysdate-2, 'DD') then ct else null end d2
from (
select x
, d
, ct
from the_table
where ...
)
)
group by x
order by x ;
WHERE condition_A OR condition_B
AND condition_C
Without proper grouping, AND binds more tightly than OR, so this evaluates as follows:
WHERE condition_A OR (condition_B
AND condition_C)
which is probably not what you want.
So remember to put the brackets in the appropriate place:
WHERE (condition_A OR condition_B)
AND condition_C
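For example (the orders table and its status and amount columns are made up):
SELECT * FROM orders WHERE (status = 'NEW' OR status = 'OPEN') AND amount > 100;
-- without the parentheses, the amount filter would apply only to the 'OPEN' rows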
Suppose there is no row with "a_col = whatever".
| Query | Result |
|---|---|
| select count(*), 'whatever' ... (no GROUP BY clause; put the condition in as a constant) | Returns one row with 0 (what I expect). --> Prefer this query. |
| select count(*), a_col ... group by a_col | Returns no row (what I do NOT expect). |
Two notes about the ON clause of table joins:
Assume that the table A has rows that do not join to table B.
1) When the condition is on the left table in a left join, you may want to put the condition in the WHERE clause.
SELECT ... FROM table_on_left LEFT JOIN table_on_right ON (join_condition) WHERE NOT condition_on_table_on_left;
SELECT ... FROM table_on_left LEFT JOIN table_on_right ON (join_condition AND NOT condition_on_table_on_left);
In a left join with the condition in the WHERE clause, the rows that satisfy "condition_on_table_on_left" are excluded, which is probably what you want.
Otherwise, the condition "NOT condition_on_table_on_left" in the ON clause restricts only the join, and the rows that satisfy "condition_on_table_on_left" appear anyway, but with no data from table_on_right showing.
2) When the condition is on the right table in a left join, you may want to put the condition in the ON statement.
SELECT ... FROM table_on_left LEFT JOIN table_on_right ON (join_condition AND NOT condition_on_table_on_right);
SELECT ... FROM table_on_left LEFT JOIN table_on_right ON (join_condition) WHERE NOT condition_on_table_on_right;
In a left join, the condition "NOT condition_on_table_on_right" in the WHERE clause basically turns the join into an inner join, because the unmatched rows have nulls for table_on_right and a null never satisfies the condition.
create table tb1 (j number, a number);
create table tb2 (j number, b number);
insert into tb1 values (1,1);
insert into tb1 values (2,5);
insert into tb1 values (3,3);
insert into tb2 values (1,4);
insert into tb2 values (3,3);
commit;
select * from tb1;
select * from tb2;
select * from tb1 left join tb2 on tb1.j = tb2.j ;
select * from tb1 left join tb2 on tb1.j = tb2.j and tb1.a is null;
select * from tb1 left join tb2 on tb1.j = tb2.j where tb1.a is null;
select * from tb1 left join tb2 on tb1.j = tb2.j and tb2.b is null;
select * from tb1 left join tb2 on tb1.j = tb2.j where tb2.b is null;
select * from tb1 left join tb2 on tb1.j = tb2.j ;
select * from tb1 left join tb2 on tb1.j = tb2.j and tb1.a <> 0;
select * from tb1 left join tb2 on tb1.j = tb2.j where tb1.a <> 0;
select * from tb1 left join tb2 on tb1.j = tb2.j and tb2.b <> 0;
select * from tb1 left join tb2 on tb1.j = tb2.j where tb2.b <> 0; -- one row less
select * from tb1 left join tb2 on tb1.j = tb2.j ;
select * from tb1 left join tb2 on tb1.j = tb2.j and tb1.a <> 3;
select * from tb1 left join tb2 on tb1.j = tb2.j where tb1.a <> 3;
select * from tb1 left join tb2 on tb1.j = tb2.j and tb2.b <> 3;
select * from tb1 left join tb2 on tb1.j = tb2.j where tb2.b <> 3;
select * from tb1 left join tb2 on tb1.j = tb2.j ;
select * from tb1 left join tb2 on tb1.j = tb2.j and tb1.a <> 5;
select * from tb1 left join tb2 on tb1.j = tb2.j where tb1.a <> 5;
select * from tb1 left join tb2 on tb1.j = tb2.j and tb2.b <> 5;
select * from tb1 left join tb2 on tb1.j = tb2.j where tb2.b <> 5;
Watch out with null strings and empty, zero-length strings:
rtrim(' ') is NOT null
rtrim(' ') = ''
| SQL Statement | Result |
|---|---|
| where trunc(a_date, 'DD') <= to_date('2008', 'YYYY'); | Up to and including 1st Jan 2008! |
| where trunc(a_date, 'DD') < to_date('2008', 'YYYY'); | Up to and including 31st Dec 2007. Excludes 2008 and later. |
| where trunc(a_date, 'DD') >= to_date('2008', 'YYYY'); | Starts with 1st Jan 2008. |
| where trunc(a_date, 'DD') > to_date('2008', 'YYYY'); | Starts with 2nd Jan 2008. Excludes 1st Jan! |
Null does not equal null. Therefore, care must be taken when joining on columns that have nulls.
table tb1 table tb2
A B A B
------ ------ ------ ------
1 1 1 1
1 2 1 2
1 (null) 1 (null)
select * from tb1, tb2 where tb1.a = tb2.a and tb1.b = tb2.b
A B A B
------ ------ ------ ------
1 1 1 1
1 2 1 2
select * from tb1, tb2 where tb1.a = tb2.a and nvl(tb1.b,0) = nvl(tb2.b,0)
A B A B
------ ------ ------ ------
1 1 1 1
1 2 1 2
1 (null) 1 (null)
Notice the extra row with the nulls.
select level r from dual connect by level <= 6; -- Generates numbers from 1 to 6 in 9i and later (Thanks Tom)
select b.b, add_months(b.b, a.a-1) from (select to_date('2017', 'YYYY') as b from dual) b, (select level a from dual connect by level <= 12) a; -- 12 months of a year
SQLite is useful for small websites, as a replacement for temporary files, for local dataset analysis, and as a stand-in for demonstrations, prototypes, and testing.
Use another DBMS for client/server applications, high-volume websites, very large datasets (over 2 terabytes, i.e. 2^41 bytes), and configurations with high concurrency.
Start: sqlite3 a_file_name.db
.help
--> help
.databases
--> List available database files
.tables
--> List the tables, views, and temporary tables
.indices table_name
--> List indexes
.read filename
--> Read and execute commands from file
.schema table_name
--> Shows the ddl for the table
.mode the_mode
--> Define output mode. Modes are "csv", "column", "html", "insert", "line", "list", "tabs", and "tcl".
The "insert new_table_name" mode is useful for creating insert scripts.
.explain select ... from ...;
--> Shows the explain plan and does not execute
VACUUM
rebuilds the database file, for example when a lot of rows were deleted
.quit or .exit
--> quit
select sqlite_version();
In windows: up arrow to get to previously used commands.
See the database schema in the "sqlite_master" table. Temporary tables are in "sqlite_temp_master".
-- comment
/* comment */
.separator "|"
.import file_name table_name
.separator \t
for tab-delimited
.mode ...
or
.output file_name
.dump table_nameselect ....;
.output stdout
Backup:
.output ${THE_DUMP_FILE}
.dump
.output stdout
Restore:
sqlite3 ${DB_FILE_NAME} < ${SQL_DUMP_FILE}
Firefox database in C:\Documents and Settings\...\Application Data\Mozilla\Firefox\Profiles\...\places.sqlite
Put this in a .bat file: C:\progfile\sqlite\bin\sqlite3.exe C:\progfile\sqlite\data\the_name.db
For the PDO object in php, remove the ";" before "extension=php_pdo_sqlite.dll" in the php.ini.
JDBC driver from https://bitbucket.org/xerial/sqlite-jdbc/downloads/
Connection string (?): jdbc:sqlite:c:/path/dbfile.db
Driver class name (?): org.sqlite.JDBC
You can put what you want in any column except in an integer primary key. See the data type documentation. Storage classes: NULL, INTEGER, REAL, TEXT, BLOB.
Column type affinities: TEXT, NUMERIC, INTEGER, REAL, BLOB.
Declaring columns with these affinity names looks like the least ambiguous way to define them.
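A minimal sketch of the flexible typing; typeof() shows the storage class actually used (table and values are made up):
create table type_demo (a integer primary key, b text, c integer);
insert into type_demo (b, c) values (123, 'abc');   -- "mismatched" types are accepted
select b, typeof(b), c, typeof(c) from type_demo;   -- b is stored as text '123'; 'abc' cannot convert, so c stays text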
The following are supported: UNION [ALL], INTERSECT, EXCEPT.
Create table:
CREATE TABLE the_table
( A integer primary key -- this makes an autoincrement field
, B text
, C integer
, D REFERENCES parent_table(a_pk)
, FOREIGN KEY(a_fk, b_fk) REFERENCES parent_table(a_pk, b_pk) -- foreign key constraint (allows a_fk to be null)
);
create [unique] index the_name on the_table(A, B);
Are foreign keys enabled? PRAGMA foreign_keys;
Enable with PRAGMA foreign_keys = ON;
select 'xxx'
- Prints out 'xxx'
Truncate table: delete from table; (without a where clause)
drop table if exists the_table_name;
In SQLite, do the commit on the connection object:
import sqlite3
cn = sqlite3.connect('db')
cn.execute("insert into the_table values (1, 'a')")  # inserts/updates/deletes...
cn.commit()
The pseudocolumn ROWID starts at 1 on the first row. On new inserts, it has the same value as a column defined as INTEGER PRIMARY KEY.
ifnull(a,b) -- returns a if a is not null, otherwise b
select changes(); -- number of rows changed by the most recent INSERT, UPDATE, or DELETE
Note: an aggregate query without a GROUP BY clause returns one row. The non-aggregate fields are taken from random source rows. (I think this is the case)
Sample date and time queries:
Note: the best approach appears to be to store dates and times as strings in the "YYYY-MM-DD HH:MM:SS" format (see ISO-8601).
Or use an integer data type in the database: insert with datetime('now'), query with datetime(date_column / 1000, 'unixepoch', 'localtime') (the /1000 assumes the column holds milliseconds since the epoch).
Current epoch time: SELECT strftime('%s', 'now');
Current time in local timezone: select strftime('%Y/%m/%d %H:%M:%f', datetime('now', 'localtime')) ;
Insert time: insert into d(dttm) values (datetime('2016-01-12 13:12:11')) ;
(corresponds to SQL server convert(..., 120))
Calculate date difference: SELECT (julianday('now', 'localtime') - julianday(dttm)), datetime( 'now', 'localtime'), dttm from d;
Calculate time difference: SELECT (strftime('%s','now', 'localtime') - strftime('%s',dttm))/60, datetime( 'now', 'localtime'), dttm from d;
Rows created in the past 3 days: where create_date > date('now', '-3 days')
Current date (no time portion): date('now')
Current timestamp in utc: datetime('now')
Current date, see below for modifiers: date('now')
An hour ago: date('now', '-1 hour')
. Also '-10 hours', '-1 day'
Local time, assuming the string to the left is UTC: date('now', 'localtime')
. 'utc' adjusts for UTC.
Functions related to strings:
Concatenation (like in Oracle): col_a || col_b
replace('main_string', 'to_look_for', 'new_string')
: replace 'to_look_for' with 'new_string' in the 'main_string'.
quote(string)
: prepares a string for inclusion in an SQL statement (quotes it and escapes embedded quotes).
char(integer)
: returns a character.
unicode(string)
: returns the unicode character code of the first character in the string.
length(string)
Length of a character string
cast('3.14' as decimal)
Data conversion
cast(3.14 as text)
Data conversion
substr(a_string, n [, m])
Substring of a_string starting at position n and m characters long
group_concat(col[,sep])
aggregates non-null string values, with optional separator
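A quick sketch exercising a few of these (no FROM clause needed in SQLite; values are made up):
select 'ab' || 'cd'                  -- abcd
     , replace('a-b-c', '-', '/')    -- a/b/c
     , substr('abcdef', 2, 3)        -- bcd
     , length('hello')               -- 5
     , quote('it''s');               -- 'it''s'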
See more at www.sqlite.org/lang_datefunc.html
Command line shell
Language Reference
Language Reference: expressions
Handling of nulls
Core functions
PDO statement
To see which process is holding the database file (e.g. when it is locked): fuser database_file.db. Stop the process if necessary.
JSON in a Snowflake table:
create table raw_src (rawj variant);
{ "level1field": "a string",
  "level1obj": [
    {"level2field1": 123, "level2fld2": "abc", "level2obj": {...}},
    {"level2field1": 456, "level2fld2": "dfe", "level2obj": {...}}
  ],
  "level2obj": { "level3obj": [
    {"level2field1": 123, "level2fld2": "abc", "level2obj": {...}},
    {"level2field1": 456, "level2fld2": "dfe", "level2obj": {...}}
  ]
  }
}
select rawj:level1field --> keeps the "" around strings
select rawj:level1field::string --> removes the "" around strings
Options: ::string, ::number, ::variant
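To get a document into the variant column in the first place, a hedged sketch using PARSE_JSON (table and column from the example above; the JSON is abbreviated):
insert into raw_src (rawj)
select parse_json('{"level1field": "a string", "level1obj": [{"level2field1": 123, "level2fld2": "abc"}]}');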
-- Query for example shown above
create or replace table raw_src (data variant);  -- note: this version names the variant column "data"
select a.data:level1field::string -- value is repeated across multiple rows
, b.value:level2field1::number
, b.value:level2fld2::string
, b.value:level2obj::variant
, c.value:level2field1::number
, c.value:level2fld2::string
, c.value:level2obj::variant
from raw_src a
, lateral flatten(a.data:level1obj) b
, lateral flatten(b.value:level2obj.level3obj) c
;
select a.data:f1::string
, a.data:f2::int
, a.data:f3::boolean
, a.data:f2::variant -- see below on next steps
from raw_src a
;
For any field f of type variant whose value starts with {"g": ...},
look at the next level with the dot notation, as in the following:
select a.data:f::variant
, a.data:f.g
, a.data:f.g2
, a.data:f.g3
from raw_src a ;
For any field f of type variant whose value starts with [ (a JSON array),
flatten the next level as in the following:
select a.data:f::variant -- field to flatten. Put below in the 'flatten' clause
, b.value -- it is always 'value'
from raw_src a
, lateral flatten (a.data:f) b ;