Let us understand the following topics:
- Pre-defined Functions
- User-defined Functions
- Higher Order and Anonymous/lambda Functions
- Operators
Pre-defined Functions
Most languages come with pre-defined functionality in the form of functions. Let us understand the philosophy by going through string manipulation in detail.
- Scala is an object-oriented programming language, and all functions are defined as methods of classes
- Even operators (+, -, > etc.) are functions; the dot and brackets are optional while invoking functions, with some limitations (this mainly works for functions that take a single argument)
- e.g.: 1 + 2 and 1.+(2) are the same, because + is also a function
- String manipulation is very common as part of processing data
- Create string variable
val s = "Hello"
- Equals
s.==("Hello")
or s == "Hello"
- Not Equals
s.!=("Hello")
or s != "Hello"
- Check whether the string contains a partial string
s.contains("ell")
or s contains "ell"
- Get part of the string
s.substring(2)
-> "llo"
or s.substring(2, 4)
-> "ll"
- Get the position of the first occurrence of a string from the beginning
s.indexOf("ll")
- Create another string
val o = "1,2013-07-25 00:00:00.0,11599,CLOSED"
- Extract date using substring from above string
- Convert string o to an array of fields
o.split(",")
- We can access elements of the array using an index, e.g.: o.split(",")(1)
gives us the date field (2013-07-25 00:00:00.0)
- toLowerCase and toUpperCase can be used to convert case
- Strings can be converted to other types using functions like toInt
- Substring can be replaced with some other string using replace
- Get size of the string
o.size
- Other important functions
- startsWith and endsWith
- concat
- reverse
- equals and equalsIgnoreCase
- Having good knowledge of string manipulation functions is very important for Data Engineering using any programming language
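The string functions listed above can be tried together on the sample record. This is a minimal sketch; the variable names (orderDate, fields, thirdField and so on) are illustrative and not part of the original material.

```scala
val o = "1,2013-07-25 00:00:00.0,11599,CLOSED"

// Extract the date by position: characters 2 to 11 hold the date
val orderDate = o.substring(2, 12)                 // "2013-07-25"

// split returns an Array[String]; fields are accessed by index
val fields = o.split(",")
val orderTimestamp = fields(1)                     // "2013-07-25 00:00:00.0"
val thirdField = fields(2).toInt                   // 11599 as an Int
val status = fields(3)                             // "CLOSED"

// Case conversion and replacement return new strings; o itself is unchanged
val lowerStatus = status.toLowerCase               // "closed"
val replaced = o.replace("CLOSED", "COMPLETE")

// Other commonly used functions
val startsWithOne = o.startsWith("1,")             // true
val endsWithClosed = o.endsWith("CLOSED")          // true
val greeting = "Hello".concat(" World")            // "Hello World"
val reversed = "Hello".reverse                     // "olleH"
val sameIgnoringCase = "Hello".equalsIgnoreCase("HELLO")  // true
```

Because split produces an array, picking fields by index is usually more robust than substring when the length of the earlier fields can vary.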
Exercises
- Exercise 1: Extract date and change format
- Create variable
val o = "120,2013-07-25 00:00:00.0,100,CLOSED"
- Extract date from o and change the format of date to YYYYMMDD
- Output should be 20130725
- Exercise 2: Convert order item to array of elements using "," as delimiter
- Create variable oi for 2,2,1073,1,199.99,199.99
- Split using "," and assign it to variable oiArray
- Exercise 3: Extract order item subtotal and add them
- Create 3 string variables for below records – oi1, oi2, oi3
- 2,2,1073,1,199.99,199.99
- 3,2,502,5,250.0,50.0
- 4,2,403,1,129.99,129.99
- Get order item subtotal (5th field) from above records
- Add them and assign the result to variable orderRevenue
- Print the message: order revenue for order id 2 is <output for orderRevenue should come here>
- Exercise 4: Check whether order status is COMPLETE
- Order Status is last field in the delimited string
- Create 3 string variables – o1, o2, o3
- 1,2013-07-25 00:00:00.0,11599,CLOSED
- 2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT
- 3,2013-07-25 00:00:00.0,12111,COMPLETE
- Print true if the order status is COMPLETE, otherwise print false
- Exercise 5: Extract order status from below strings
- Order Status is last field in the delimited string
- Create 3 string variables – o1, o2, o3
- 1,2013-07-25 00:00:00.0,11599,CLOSED
- 2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT
- 3,2013-07-25 00:00:00.0,12111,COMPLETE
- Create 3 variables – o1OrderStatus, o2OrderStatus, o3OrderStatus which contain only order status
User-defined Functions
As part of this topic we will see functions/definitions in detail
- Functional programming is a programming paradigm (others: imperative and logic programming)
- Advantages of Functional programming
- Simple reasoning principles
- Better modularity
- Good for leveraging multicore for parallelism and cloud computing
- Functions are expressions (not statements)
- Functions can be nested
- Functions can be assigned to variables
- Functions can be passed as arguments and returned from other functions
- Even though Scala supports both call by value and call by name, the default is call by value. It is also recommended to use call by value. Do not worry too much about the difference at this time.
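The properties listed above can be illustrated with a few small definitions. This is only a sketch; every name in it (square, sumOfSquares, multiplier, twiceByValue, twiceByName, counted) is hypothetical and chosen for illustration.

```scala
// The body of a function is an expression; its value is the result
def square(x: Int): Int = x * x

// Functions can be nested: sq is visible only inside sumOfSquares
def sumOfSquares(a: Int, b: Int): Int = {
  def sq(x: Int): Int = x * x
  sq(a) + sq(b)
}

// Functions can be assigned to variables
val squareFn: Int => Int = square

// Functions can be returned from other functions
def multiplier(factor: Int): Int => Int = (x: Int) => x * factor
val triple = multiplier(3)

// Call by value (default): the argument is evaluated once, before the call
def twiceByValue(x: Int): Int = x + x
// Call by name (=> Int): the argument is evaluated each time x is used
def twiceByName(x: => Int): Int = x + x

var calls = 0
def counted(): Int = { calls += 1; 10 }

val byValue = twiceByValue(counted())   // counted() runs once
val callsAfterByValue = calls           // 1
val byName = twiceByName(counted())     // counted() runs twice
val callsAfterByName = calls            // 3
```

Both calls return 20; the only visible difference here is how many times the argument expression is evaluated.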
Task 1 – Factorial
Develop a function which will return factorial of a given number
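One possible way to approach this task is with a loop. The sketch below assumes a non-negative argument and uses Long for the result, so it is only accurate up to 20!.

```scala
// Iterative factorial; assumes n >= 0
def factorial(n: Int): Long = {
  var result = 1L
  // The range 2 to n is empty for n = 0 and n = 1, so result stays 1
  for (i <- 2 to n) result *= i
  result
}

val fiveFactorial = factorial(5)   // 120
```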
Task 2 – Fibonacci
Develop a function which will print a given number of elements of the Fibonacci series
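A minimal sketch for this task, assuming the series starts with 0 and 1 and that the argument is the number of elements to print. The helper fibonacci and the function printFibonacci are illustrative names.

```scala
// Build the first n elements of the series, assuming it starts with 0, 1
def fibonacci(n: Int): List[Long] = {
  var previous = 0L
  var current = 1L
  val elements = scala.collection.mutable.ListBuffer.empty[Long]
  for (_ <- 1 to n) {
    elements += previous
    val next = previous + current
    previous = current
    current = next
  }
  elements.toList
}

// Print the elements separated by spaces
def printFibonacci(n: Int): Unit = println(fibonacci(n).mkString(" "))

val firstSeven = fibonacci(7)   // List(0, 1, 1, 2, 3, 5, 8)
```

Keeping the computation (fibonacci) separate from the printing (printFibonacci) makes the logic easy to check without reading console output.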
Task 3 – Factorial Recursive
Develop a function which will return factorial of a given number recursively
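The recursive version can be sketched as below, again assuming a non-negative argument and a Long result. The name factorialRec is illustrative.

```scala
// Recursive factorial: n! = n * (n - 1)!, with 0! = 1! = 1
def factorialRec(n: Int): Long =
  if (n <= 1) 1L else n * factorialRec(n - 1)

val sixFactorial = factorialRec(6)   // 720
```

A recursive function needs an explicit return type in Scala, which is why the result type Long is declared.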
Task 4 – Combinations
Given 2 arguments n and r compute nCr (n! / ((n-r)! * r!))
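The formula above translates almost directly into code. This sketch assumes 0 <= r <= n and uses a nested helper (fact, an illustrative name) for the factorial; with Long results it is only reliable while n is at most 20.

```scala
// nCr = n! / ((n - r)! * r!), assuming 0 <= r <= n
def nCr(n: Int, r: Int): Long = {
  // Nested function, visible only inside nCr
  def fact(i: Int): Long = if (i <= 1) 1L else i * fact(i - 1)
  fact(n) / (fact(n - r) * fact(r))
}

val fiveChooseTwo = nCr(5, 2)   // 10
```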
Exercise – isFibonacci
Given an integer argument, return true if the number belongs to the Fibonacci series, else return false (e.g.: isFibonacci(13) should return true and isFibonacci(24) should return false)
Higher Order and Anonymous/lambda Functions
As part of this topic we will see higher order functions and anonymous functions
- In Scala a function can be passed as a parameter and returned as a result
- A parameter that is a function is declared like any other parameter, using a function type (e.g.: sum takes a function f as parameter)
- While defining sum we do not provide the functionality for f; we only declare its type, f: Int => Int
- While invoking sum we need to provide the functionality for the parameters which are defined as functions (for parameter f in this case)
- While invoking sum, the value for f can be a regular function, a variable that holds a function, or an anonymous function
- See the example below
- Anonymous functions do not need a name
- Anonymous functions can be assigned to variables
- Those variables can be passed as parameters (f in this case)
- We can also pass an anonymous function directly while invoking a function that has a parameter defined as a function (here sum is the main function and f is the parameter that accepts the anonymous function)
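The points above can be sketched with a sum function. Only the parameter f: Int => Int comes from the text; the remaining parameters (a and b as range bounds) and the helpers cube and square are assumptions made for illustration.

```scala
// f is declared only by its type; its functionality is supplied by the caller.
// Adds f(a) + f(a + 1) + ... + f(b)
def sum(f: Int => Int, a: Int, b: Int): Int =
  if (a > b) 0 else f(a) + sum(f, a + 1, b)

// A regular function
def cube(x: Int): Int = x * x * x

// An anonymous function assigned to a variable
val square = (x: Int) => x * x

val sumOfCubes = sum(cube, 1, 3)           // 1 + 8 + 27 = 36
val sumOfSquares = sum(square, 1, 3)       // 1 + 4 + 9 = 14
// An anonymous function passed directly
val sumOfDoubles = sum(x => x * 2, 1, 3)   // 2 + 4 + 6 = 12
```

The same sum definition serves all three calls; only the value passed for f changes.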
Operators
Let us explore different operators in Scala
- Operators are all functions
- In Scala functions can be invoked without using ., () etc., with some restrictions
- Even numeric operators such as +, -, *, / etc. are functions
- We can define operators for any class by implementing them as functions
- == is also a function, which invokes the equals function
- We can use either equals or == to compare 2 objects
- As == invokes equals, if equals is overridden then == automatically uses the overridden behavior
- eq is the function which checks whether 2 references point to the same object (reference equality, similar to == on objects in Java)
- Infix notation and dot notation are equivalent:
val (a, b) = (1, 2); a + b
and val (a, b) = (1, 2); a.+(b)
are the same
- For Boolean we have functions/operators such as && (and), || (or) and ! (negation)
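The following sketch shows an operator defined on a class, and the difference between == and eq. The class Money is hypothetical and exists only for this illustration.

```scala
// Hypothetical class used only for illustration
class Money(val amount: Int) {
  // + is an ordinary function whose name happens to be a symbol
  def +(other: Money): Money = new Money(amount + other.amount)

  // == delegates to equals, so overriding equals changes ==
  override def equals(other: Any): Boolean = other match {
    case m: Money => amount == m.amount
    case _        => false
  }
  // Keep hashCode consistent with equals
  override def hashCode: Int = amount.hashCode
}

val m1 = new Money(10)
val m2 = new Money(10)

val infixTotal = m1 + m2        // infix notation
val dotTotal = m1.+(m2)         // same call written with dot and brackets

val sameValue = m1 == m2        // true: uses the overridden equals
val sameObject = m1 eq m2       // false: two distinct objects

// Boolean operators
val (a, b) = (1, 2)
val both = (a < b) && (b > 0)   // true
val either = (a > b) || (b > 0) // true
val negated = !(a < b)          // false
```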