Monday, April 25, 2022

JDBC - Data type Mapping for Snowflake

 It is important to understand how the various JDBC data types map to Snowflake data types. The table below lists the mapping; a quick way to see this kind of mapping in action is sketched after the table.

JDBC Data Type        Snowflake Data Type
BIT                   BOOLEAN
TINYINT               SMALLINT
SMALLINT              SMALLINT
INTEGER               INTEGER
BIGINT                BIGINT
FLOAT                 FLOAT
REAL                  FLOAT
DECIMAL(m,n)          NUMBER(m,n)
NUMERIC(m,n)          NUMBER(m,n)
DOUBLE                FLOAT
CHAR(m)               CHAR(m)
LONGVARCHAR           STRING
LONGNVARCHAR          STRING
NCHAR(m)              STRING
NVARCHAR              VARCHAR(m)
VARCHAR(m)            VARCHAR(m)
DATE                  DATE
TIME                  TIMESTAMP
TIMESTAMP             TIMESTAMP
ARRAY                 STRING
BINARY                BINARY
VARBINARY             BINARY
LONGVARBINARY         BINARY
JAVA_OBJECT           STRING
BLOB                  BINARY
CLOB                  STRING
NCLOB                 STRING
STRUCT                STRING
REF                   STRING
BOOLEAN               BOOLEAN
ROWID                 STRING
SQLXML                STRING
OTHER                 STRING
NULL                  STRING
DISTINCT              STRING
DATALINK              STRING
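
One way to see this kind of mapping in action (not through the JDBC driver itself, but the same type normalization applies when Snowflake parses the DDL) is to create a table with source-style type names and then describe it. Below is a minimal sketch using the snowflake-connector-python package; the connection parameters and table name are placeholders:

import snowflake.connector

# Placeholder credentials - replace with your own account details
conn = snowflake.connector.connect(
    account="<account>",
    user="<user>",
    password="<password>",
    warehouse="<warehouse>",
    database="<database>",
    schema="<schema>",
)
cur = conn.cursor()

# Create a table using common source type names; Snowflake normalizes them on parse
cur.execute("""
    create or replace temporary table type_mapping_demo (
        c_bigint   bigint,
        c_double   double,
        c_decimal  decimal(12,2),
        c_varchar  varchar(50),
        c_boolean  boolean
    )
""")

# DESC TABLE reports the Snowflake type each column ended up with
for row in cur.execute("desc table type_mapping_demo"):
    print(row[0], row[1])

conn.close()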

How to generate a log file on the Snowflake connector - SnowSQL

 

SnowSQL

  • What is the Snowflake SnowSQL version?

          > snowsql --version

          > snowsql --bootstrap-version

          > snowsql --versions

  • Is it reproducible with the latest SnowSQL version?
  • Is SnowSQL used interactively or in a batch script?
  • Generate log files:
    • Add -o log_level=DEBUG to the usual command-line arguments, or, if you prefer the SnowSQL config file, set log_level=DEBUG there (a sketch of the relevant config entries follows this list):
      • Linux or Mac: ~/.snowsql/config
      • Windows: %USERPROFILE%\.snowsql\config
    • If you're generating DEBUG level logs to troubleshoot a specific issue, consider writing to a separate log file while reproducing the issue. This can be done by adding -o log_file, for example:
      • Linux or Mac: snowsql -o log_level=DEBUG -o log_file=/path/to/snowsql_debug.log <rest of the usual arguments>
      • Windows: snowsql.exe -o log_level=DEBUG -o log_file=C:\Temp\snowsql_debug.log <rest of the usual arguments>
  • Collect log files:
    • Linux or Mac: ~/.snowsql/log*
    • Windows: %USERPROFILE%\.snowsql\log*
    • or, if using log_file, collect the log file from the path specified in log_file, e.g. C:\Temp\snowsql_debug.log in the above example.
  • Documentation reference:
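
If you go the config-file route, here is a minimal sketch of the relevant entries, assuming the default config location and the standard [options] section (the log_file path is only an example):

[options]
log_level = DEBUG
log_file = /path/to/snowsql_debug.log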

Tuesday, April 19, 2022

replace_inside_parentheses

Code

def replace_inside_parentheses(string, find_string, replace_string):
    # Walk the string character by character, tracking how deeply nested we
    # are inside parentheses; only characters at depth > 0 are eligible for
    # replacement. Because the replacement is applied one character at a
    # time, this effectively supports single-character find_string values.
    bracket_count = 0
    return_string = ""
    for a in string:
        if a == "(":
            bracket_count += 1
        elif a == ")":
            bracket_count -= 1
        if bracket_count > 0:
            return_string += a.replace(find_string, replace_string)
        else:
            return_string += a
    return return_string


my_str = "decimal(12,01)"
print(my_str)

print(replace_inside_parentheses(my_str, ",", "%")) 
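
As a quick sanity check with a made-up nested input, commas outside the parentheses are preserved while commas at any nesting depth are replaced:

print(replace_inside_parentheses("a,b(c,(d,e)),f", ",", "%"))
# prints: a,b(c%(d%e)),f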

Friday, April 8, 2022

spark salting technique explained

 The following demonstrates the Spark salting technique. When one join key is heavily skewed, appending a random salt suffix to the key on the large (fact) side and exploding the small (dimension) side across every possible salt value spreads the skewed key's rows over multiple partitions, so the join no longer bottlenecks on a single task.



var df1 = Seq((1,"a"),(2,"b"),(1,"c"),(1,"x"),(1,"y"),(1,"g"),(1,"k"),(1,"u"),(1,"n")).toDF("ID","NAME")
df1.createOrReplaceTempView("fact")

var df2 = Seq((1,10),(2,30),(3,40)).toDF("ID","SALARY")
df2.createOrReplaceTempView("dim")

// Salt the fact-side key: FLOOR(RAND(123456)*19) yields salt values 0 through 18
val salted_df1 = spark.sql("""select concat(ID, '_', FLOOR(RAND(123456)*19)) as salted_key, NAME from fact """)
salted_df1.createOrReplaceTempView("salted_fact")

// Remember that the exploded array should start with 0 and cover every salt
// value that the FLOOR and RAND expression above can produce
val exploded_dim_df = spark.sql(""" select ID, SALARY, explode(array(0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19)) as salted_key from dim""")
//val exploded_dim_df = spark.sql(""" select ID, SALARY, explode(array(0 to 19)) as salted_key from dim""")
exploded_dim_df.createOrReplaceTempView("salted_dim")

// Join on the salted key, then strip the salt to recover the original ID
val result_df = spark.sql("""select split(fact.salted_key, '_')[0] as ID, dim.SALARY
            from salted_fact fact
            LEFT JOIN salted_dim dim
            ON fact.salted_key = concat(dim.ID, '_', dim.salted_key) """)

display(result_df)

Wednesday, April 6, 2022

Python - printing the log to the console

import logging

# Create a named logger and allow INFO-level messages through
logger = logging.getLogger("Dev")
logger.setLevel(logging.INFO)

# Attach a console (stream) handler with a timestamped message format
consoleHandler = logging.StreamHandler()
consoleHandler.setLevel(logging.INFO)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
consoleHandler.setFormatter(formatter)

logger.addHandler(consoleHandler)

logger.info("This is printing in console")
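
If you also want the same messages written to a file, attach a second handler to the same logger. Below is a minimal sketch using the standard logging.FileHandler; the file name app.log is just an example:

fileHandler = logging.FileHandler("app.log")
fileHandler.setLevel(logging.INFO)
fileHandler.setFormatter(formatter)
logger.addHandler(fileHandler)

logger.info("This goes to both the console and app.log")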
