
How to write each row as a string inside a foreach iteration without a NullPointerException?


I am trying to capture a string value generated by a foreach iteration over all rows of a DataFrame, after replacing values. Here is a simplified excerpt of the code block I am using; the logic is the same as in the real code. The last two statements both throw a NullPointerException:

{
    val finalResult = hiveContext.sql("""SELECT * from """ + temptableName)

    if (finalResult.count() > 0) {
        finalResult.foreach { row =>

            RealtimeUtil.loadStreamProperties("TestFlow.properties")
            val tag1 = row.getAs("new_val1").asInstanceOf[String]
            val tag2 = row.getAs("new_val2").asInstanceOf[String]

            val string = """This is a template string with some value
                "-Name":"TAG1", "#text":"****",
                "-Name":"TAG2", "#text":"****"
                """

            val string2 = string
              .replace(""""-Name":"TAG1", "#text":"****"""", """"-Name":"TAG1", "#text":"""" + tag1 + "\"")
              .replace(""""-Name":"TAG2", "#text":"****"""", """"-Name":"TAG2", "#text":"""" + tag2 + "\"")

            hiveContext.sql("""insert into table testdb.final_table values ("""" + string2 + ")""")
            // above statement failing with NPE
            val someDF = Seq((1, string2)).toDF("seq", "value").coalesce(1).select("value").write.format("text").mode("append").save("/tmp/output")
            // above statement failing with NPE
        }
    }
}

I am completely new to Spark and Scala, with very little Java experience, and still learning the ropes. If someone can help and explain why I am getting the NullPointerException, that would be great. How can I get around the issue and still extract the string2 value and write it to HDFS from each microbatch? I am looking for a way to capture string2 from inside the foreach loop over the DataFrame. (Assume all initialization is done, and temptableName is a tempView created from a DataFrame with fields such as new_val1 and new_val2.)
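For context, an NPE in this shape typically comes from using hiveContext (and implicits such as toDF) inside the foreach closure: the closure runs on executors, where the driver-side SQL context is null. A minimal, untested sketch of the usual restructuring, assuming Spark 1.x with a HiveContext and the same column names as above (the template text and the /tmp/output path are placeholders from the question):

```scala
import org.apache.spark.sql.SaveMode

val finalResult = hiveContext.sql("SELECT * from " + temptableName)

val template = """This is a template string with some value
    "-Name":"TAG1", "#text":"****",
    "-Name":"TAG2", "#text":"****"
    """

// map runs on the executors but references no driver-only objects,
// so each row is rendered to its string2 without touching hiveContext
val rendered = finalResult.map { row =>
  val tag1 = row.getAs("new_val1").asInstanceOf[String]
  val tag2 = row.getAs("new_val2").asInstanceOf[String]
  template
    .replace(""""-Name":"TAG1", "#text":"****"""", """"-Name":"TAG1", "#text":"""" + tag1 + "\"")
    .replace(""""-Name":"TAG2", "#text":"****"""", """"-Name":"TAG2", "#text":"""" + tag2 + "\"")
}

// back on the driver: one write for the whole batch instead of one per row
import hiveContext.implicits._
rendered.toDF("value").coalesce(1).write.format("text").mode(SaveMode.Append).save("/tmp/output")
```

The key change is that all work done per row is a pure transformation; anything that needs the SQL context (the insert, the toDF, the save) happens once, on the driver, after the transformation.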
