Monday, April 25, 2022

JDBC - Data type Mapping for Snowflake

 It is important to understand how the various JDBC data types map to Snowflake data types. The table below lists the mapping; a quick way to see this kind of mapping in action is sketched after the table.

JDBC Data Type        Snowflake Data Type
BIT                   BOOLEAN
TINYINT               SMALLINT
SMALLINT              SMALLINT
INTEGER               INTEGER
BIGINT                BIGINT
FLOAT                 FLOAT
REAL                  FLOAT
DECIMAL(m,n)          NUMBER(m,n)
NUMERIC(m,n)          NUMBER(m,n)
DOUBLE                FLOAT
CHAR(m)               CHAR(m)
LONGVARCHAR           STRING
LONGNVARCHAR          STRING
NCHAR(m)              STRING
NVARCHAR              VARCHAR(m)
VARCHAR(m)            VARCHAR(m)
DATE                  DATE
TIME                  TIMESTAMP
TIMESTAMP             TIMESTAMP
ARRAY                 STRING
BINARY                BINARY
VARBINARY             BINARY
LONGVARBINARY         BINARY
JAVA_OBJECT           STRING
BLOB                  BINARY
CLOB                  STRING
NCLOB                 STRING
STRUCT                STRING
REF                   STRING
BOOLEAN               BOOLEAN
ROWID                 STRING
SQLXML                STRING
OTHER                 STRING
NULL                  STRING
DISTINCT              STRING
DATALINK              STRING
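
One way to see this kind of mapping in action (not through the JDBC driver itself, but the same type normalization applies when Snowflake parses the DDL) is to create a table with source-style type names and then describe it. Below is a minimal sketch using the snowflake-connector-python package; the connection parameters and table name are placeholders:

import snowflake.connector

# Placeholder credentials - replace with your own account details
conn = snowflake.connector.connect(
    account="<account>",
    user="<user>",
    password="<password>",
    warehouse="<warehouse>",
    database="<database>",
    schema="<schema>",
)
cur = conn.cursor()

# Create a table using common source type names; Snowflake normalizes them on parse
cur.execute("""
    create or replace temporary table type_mapping_demo (
        c_bigint   bigint,
        c_double   double,
        c_decimal  decimal(12,2),
        c_varchar  varchar(50),
        c_boolean  boolean
    )
""")

# DESC TABLE reports the Snowflake type each column ended up with
for row in cur.execute("desc table type_mapping_demo"):
    print(row[0], row[1])

conn.close()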

How to generate a log file on the Snowflake connector - SnowSQL

 

SnowSQL

  • What is the Snowflake SnowSQL version?

          > snowsql --version

          > snowsql --bootstrap-version

          > snowsql --versions

  • Is it reproducible with the latest SnowSQL version?
  • Is SnowSQL used interactively or in a batch script?
  • Generate log files:
    • Add -o log_level=DEBUG to the usual command-line arguments, or, if you prefer the SnowSQL config file, set log_level=DEBUG there (a sketch of the relevant config entries follows this list):
      • Linux or Mac: ~/.snowsql/config
      • Windows: %USERPROFILE%\.snowsql\config
    • If you're generating DEBUG level logs to troubleshoot a specific issue, consider writing to a separate log file while reproducing the issue. This can be done by adding -o log_file, for example:
      • Linux or Mac: snowsql -o log_level=DEBUG -o log_file=/path/to/snowsql_debug.log <rest of the usual arguments>
      • Windows: snowsql.exe -o log_level=DEBUG -o log_file=C:\Temp\snowsql_debug.log <rest of the usual arguments>
  • Collect log files:
    • Linux or Mac: ~/.snowsql/log*
    • Windows: %USERPROFILE%\.snowsql\log*
    • or, if using log_file, collect the log file from the path specified in log_file, e.g. C:\Temp\snowsql_debug.log in the above example.
  • Documentation reference:
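
If you go the config-file route, here is a minimal sketch of the relevant entries, assuming the default config location and the standard [options] section (the log_file path is only an example):

[options]
log_level = DEBUG
log_file = /path/to/snowsql_debug.log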

Tuesday, April 19, 2022

replace_inside_parentheses

Code

def replace_inside_parentheses(string, find_string, replace_string):
    # Walk the string character by character, tracking how deeply nested we
    # are inside parentheses; only characters at depth > 0 are eligible for
    # replacement. Because the replacement is applied one character at a
    # time, this effectively supports single-character find_string values.
    bracket_count = 0
    return_string = ""
    for a in string:
        if a == "(":
            bracket_count += 1
        elif a == ")":
            bracket_count -= 1
        if bracket_count > 0:
            return_string += a.replace(find_string, replace_string)
        else:
            return_string += a
    return return_string


my_str = "decimal(12,01)"
print(my_str)

print(replace_inside_parentheses(my_str, ",", "%")) 
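
As a quick sanity check with a made-up nested input, commas outside the parentheses are preserved while commas at any nesting depth are replaced:

print(replace_inside_parentheses("a,b(c,(d,e)),f", ",", "%"))
# prints: a,b(c%(d%e)),f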

Friday, April 8, 2022

spark salting technique explained

 The following demonstrates the Spark salting technique. When one join key is heavily skewed, appending a random salt suffix to the key on the large (fact) side and exploding the small (dimension) side across every possible salt value spreads the skewed key's rows over multiple partitions, so the join no longer bottlenecks on a single task.



var df1 = Seq((1,"a"),(2,"b"),(1,"c"),(1,"x"),(1,"y"),(1,"g"),(1,"k"),(1,"u"),(1,"n")).toDF("ID","NAME")
df1.createOrReplaceTempView("fact")

var df2 = Seq((1,10),(2,30),(3,40)).toDF("ID","SALARY")
df2.createOrReplaceTempView("dim")

// Salt the fact-side key: FLOOR(RAND(123456)*19) yields salt values 0 through 18
val salted_df1 = spark.sql("""select concat(ID, '_', FLOOR(RAND(123456)*19)) as salted_key, NAME from fact """)
salted_df1.createOrReplaceTempView("salted_fact")

// Remember that the exploded array should start with 0 and cover every salt
// value that the FLOOR and RAND expression above can produce
val exploded_dim_df = spark.sql(""" select ID, SALARY, explode(array(0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19)) as salted_key from dim""")
//val exploded_dim_df = spark.sql(""" select ID, SALARY, explode(array(0 to 19)) as salted_key from dim""")
exploded_dim_df.createOrReplaceTempView("salted_dim")

// Join on the salted key, then strip the salt to recover the original ID
val result_df = spark.sql("""select split(fact.salted_key, '_')[0] as ID, dim.SALARY
            from salted_fact fact
            LEFT JOIN salted_dim dim
            ON fact.salted_key = concat(dim.ID, '_', dim.salted_key) """)

display(result_df)

Wednesday, April 6, 2022

Python - printing the log to the console

import logging

# Create a named logger and allow INFO-level messages through
logger = logging.getLogger("Dev")
logger.setLevel(logging.INFO)

# Attach a console (stream) handler with a timestamped message format
consoleHandler = logging.StreamHandler()
consoleHandler.setLevel(logging.INFO)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
consoleHandler.setFormatter(formatter)

logger.addHandler(consoleHandler)

logger.info("This is printing in console")
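
If you also want the same messages written to a file, attach a second handler to the same logger. Below is a minimal sketch using the standard logging.FileHandler; the file name app.log is just an example:

fileHandler = logging.FileHandler("app.log")
fileHandler.setLevel(logging.INFO)
fileHandler.setFormatter(formatter)
logger.addHandler(fileHandler)

logger.info("This goes to both the console and app.log")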
