Reading Text Data From Files

Let us see how we can read text data from files into a data frame. Spark also has APIs for other file formats, but we will get into those details later.

  • We can use spark.read.csv or spark.read.text to read text data.
  • spark.read.csv can be used for comma separated data. Default field names will be of the form _c0, _c1, etc. We can pass a different delimiter through the sep keyword argument.
  • We can also use spark.read.format with the file type. We can use schema to define the schema, option (for example with sep) to pass the delimiter, and load to load data from a given location into a data frame.
  • spark.read.text can be used to read fixed length data where there is no delimiter. Each line is read into a single field, and the default field name is value.
  • We can also define attribute names using the toDF function.
  • In either case, the data will be represented as strings unless a schema is supplied.
  • We can convert data types using the cast function.
  • We will see all the other functions soon, but first let us read the data into a data frame and represent each field in its original data type.
