Tuesday, September 8, 2020

How do implicits work in Spark/Scala?

Q: 

Here is some sample Spark code, which converts a Seq to a Dataset:

import spark.implicits._
val s = Seq(1, 2, 3, 4)
val ds = s.toDS()

The Scala Seq does not have a toDS method; it comes from the Spark implicits. How is the Dataset created here?

A:

Scala has a way to add methods to existing classes, like extension methods in Kotlin (and C#, as I remember), but it does this in a different way, through implicits.

To add a method to an existing class, you first create an implicit class:

object StringImplicits {
  implicit class StringUtils(s: String) {
    def someCoolMethod = println("Yooo")
  }
}

object Application extends App {
  import StringImplicits._
  val s = "Hello"
  s.someCoolMethod
}

You import this StringUtils and can then call someCoolMethod on any instance of String.

Notice that the StringUtils class takes a String as a constructor parameter.

When you call a method on a String, the Scala compiler first looks for that method in the String class itself.

If it does not find it, it looks at the imported implicit classes that take a String constructor parameter.

If one is found, it calls the method on an instance of that class.

If no such class is found, it raises a compile-time error.
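Spark's `import spark.implicits._` follows the same pattern: it brings into scope an implicit that wraps a local Seq in a holder class providing toDS(). Here is a minimal stand-in sketch of that idea; the Dataset and FakeSparkImplicits below are dummies invented for illustration, not Spark's actual code (real Spark also requires an Encoder[T] for the element type):

```scala
// A dummy Dataset so the example runs without a Spark dependency.
class Dataset[T](val data: Seq[T]) {
  def show(): Unit = data.foreach(println)
}

object FakeSparkImplicits {
  // Roughly what spark.implicits._ provides: an implicit wrapper
  // that adds toDS() to any local Seq.
  implicit class DatasetHolder[T](s: Seq[T]) {
    def toDS(): Dataset[T] = new Dataset(s)
  }
}

object Demo extends App {
  import FakeSparkImplicits._
  // The compiler rewrites this to: new DatasetHolder(Seq(1, 2, 3, 4)).toDS()
  val ds = Seq(1, 2, 3, 4).toDS()
  ds.show()
}
```

Since Seq has no toDS method, the compiler searches the implicits in scope, finds DatasetHolder, wraps the Seq, and calls toDS() on the wrapper, exactly the lookup steps described above.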

