Install and Configure Sentry

Let us see how to install and configure Sentry.

  • The Sentry service is an RPC server that stores authorization metadata in an underlying relational database.
  • It provides RPC interfaces to retrieve and manipulate privileges.
  • It can be integrated with Kerberos for security.
  • The service serves authorization metadata from the database-backed storage; it does not handle actual privilege validation.
  • The Hive, Impala, and Solr services are clients of this service.
  • These services enforce Sentry privileges when they are configured to use Sentry.

Prerequisites

  • Java must be installed on all client nodes, with $JAVA_HOME configured.
  • Make sure the cluster is running and Kerberized. For testing purposes without Kerberos, set sentry.hive.testing.mode to true once the Sentry service is added.
    • Cloudera Manager -> Hive -> Configuration -> Sentry Service Advanced Configuration Snippet (Safety Valve) for sentry-site.xml
  • To define a role and grant privileges to a user group, make sure that the user group is created on all the nodes.
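
The prerequisites above can be checked on each node with a small script. This is only a minimal sketch: the function name check_sentry_prereqs is hypothetical, and the developers group is just the example group used later in this post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: checks the Sentry prerequisites on the current node.
# Usage: check_sentry_prereqs <group-name>
check_sentry_prereqs() {
  local group="$1"
  local status=0

  # $JAVA_HOME must be set and point to a directory containing bin/java.
  if [ -z "${JAVA_HOME:-}" ] || [ ! -x "${JAVA_HOME}/bin/java" ]; then
    echo "JAVA_HOME is not set or does not contain bin/java"
    status=1
  fi

  # The group that will receive privileges must exist on this node.
  if ! getent group "$group" >/dev/null 2>&1; then
    echo "group '${group}' not found on this node"
    status=1
  fi

  return "$status"
}

# Example: check for the developers group used later in this post.
check_sentry_prereqs developers || echo "prerequisites are not satisfied yet"
```

The function returns a non-zero status when either check fails, so it can be looped over all hosts with ssh if needed.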

Install Sentry

We can use Cloudera Manager to set up Sentry.

  • Make sure a database is created for Sentry. We have MySQL running on bigdataserver-1; let us set up a database named sentry in it.

https://gist.github.com/dgadiraju/cd933a2db43d5a3a72fd8ae1a02be895
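
The gist above has the commands actually used. As a rough sketch of what creating such a database typically involves, the SQL can be generated by a small function and piped to the mysql client. The function name sentry_db_sql, the MySQL user name sentry, and the password change_me are all assumptions for illustration and are not taken from the gist.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the SQL for a Sentry database and its user.
# Usage: sentry_db_sql <database> <user> <password>
sentry_db_sql() {
  local db="$1" user="$2" password="$3"
  cat <<SQL
CREATE DATABASE IF NOT EXISTS ${db} DEFAULT CHARACTER SET utf8;
CREATE USER '${user}'@'%' IDENTIFIED BY '${password}';
GRANT ALL PRIVILEGES ON ${db}.* TO '${user}'@'%';
SQL
}

# Print the statements for review; the user name and password are placeholders.
sentry_db_sql sentry sentry 'change_me'

# To apply them on bigdataserver-1 (prompts for the MySQL root password):
#   sentry_db_sql sentry sentry 'change_me' | mysql -u root -p
```

Printing the statements first makes it easy to review them before they reach the database. Note that CREATE USER fails if the user already exists.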

  • We also need to make sure that mysql-connector-java is installed on the node where we are going to configure the Sentry Server. In our case it is bigdataserver-4. We installed it earlier and can validate by running ls -ltr /usr/share/java/mysql-connector-java.jar
  • If the MySQL Connector is not found, we can install it using sudo yum -y install mysql-connector-java
  • Go to Add Service -> Choose Sentry
  • Choose Sentry Server (bigdataserver-4) and Gateway (bigdataserver-1)
  • Add Database details
  • Complete Setup Process

Configure Sentry

We can use Sentry with different high-level services such as Hive, Impala and Hue.

  • Change the Hive warehouse permissions
  • Disable impersonation
  • Make sure system users such as hive and impala can run YARN jobs in the cluster
  • Block Hive CLI access
  • Enable Sentry in Hive and Impala
  • Enable Sentry in Hue
  • Add the Sentry admin group

Enabling the Sentry Service for Hive, Impala and Hue

We will set up Sentry for all 3 services.

Let us see how to configure Hive to use Sentry for authorization.

  • Typically, without Sentry, we give 777 permissions on /user/hive/warehouse and enable impersonation so that Hive queries run as the actual users who submit them.
  • With Sentry, queries run as the hive user while Sentry authorizes the actual user, hence we need to disable impersonation and restrict the warehouse directory to the hive user.
  • Change the permissions and ownership of the warehouse directory:
    sudo -u hdfs hdfs dfs -chmod -R 775 /user/hive/warehouse
    sudo -u hdfs hdfs dfs -chown -R hive:hive /user/hive/warehouse
  • Disable Impersonation – With impersonation disabled, jobs submitted from Hue run on YARN as the hive user instead of the individual user identities. Enabling HiveServer2 impersonation bypasses Sentry in the end-to-end authorization process. To disable impersonation for HiveServer2 in the Cloudera Manager Admin Console:
    • Go to the Hive service -> Configuration tab.
    • Select Scope -> HiveServer2 & Category -> Main.
    • Uncheck the HiveServer2 Enable Impersonation checkbox.
    • Click Save Changes to commit the changes.
  • Enable System Users to Submit YARN Jobs – Since we have disabled Hive impersonation, we need to make sure the hive user is added to the YARN configuration so that it can submit jobs.
    • If you are using YARN, enable the hive user to submit YARN jobs as follows.
    • Go to the YARN service -> Configuration tab.
    • Select Scope -> NodeManager & Category -> Security.
    • Ensure the Allowed System Users property includes the hive user. If not, add hive.
    • Click Save Changes to commit the changes.
    • Repeat the steps above for every NodeManager role group for the YARN service that is associated with Hive.
    • Restart the YARN service.
  • Block Hive CLI Access – This blocks Hive CLI access for regular users who are not part of groups such as hive and hue.
    • Go to the Hive service -> Configuration tab.
    • Locate the hadoop.proxyuser.hive.groups parameter and click the plus sign.
    • Enter hive into the text box and click the plus sign again.
    • Add hue and sentry in the same way.
    • Click Save Changes.
  • Enable Sentry in Hive – Here we configure the Hive service to use Sentry.
    • Go to the Hive service.
    • Click the Configuration tab.
    • Select Scope -> Hive (Service-Wide).
    • Select Category -> Main.
    • Locate the Sentry Service property and select Sentry.
    • If there is any validation error to be fixed, click on the error and check “Enable Stored Notifications in Database”.
    • Click Save Changes to commit the changes.
    • Restart the Hive service.

Note: If the cluster is not Kerberized, make sure sentry.hive.testing.mode is set to true, as described in the prerequisites.
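
For a non-Kerberized test cluster, the value pasted into the Sentry Service Advanced Configuration Snippet (Safety Valve) for sentry-site.xml is a standard Hadoop-style property. The sketch below simply writes that snippet to a local file so it can be reviewed and pasted into Cloudera Manager; the file name sentry-testing-mode.xml is arbitrary.

```shell
#!/usr/bin/env bash
# Write the safety valve snippet to a local file; the file name is arbitrary.
cat > sentry-testing-mode.xml <<'XML'
<property>
  <name>sentry.hive.testing.mode</name>
  <value>true</value>
</property>
XML

# Show the snippet so it can be copied into Cloudera Manager.
cat sentry-testing-mode.xml
```

This setting is meant for testing without Kerberos only and should not be carried over to a secured cluster.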

Enabling the Sentry Service for Impala

This step enables Sentry privileges for the Impala service.

  • Go to the Impala service.
  • Click the Configuration tab.
  • Locate the Sentry Service property and select Sentry.
  • Click Save Changes to commit the changes.
  • Restart the Impala service.

Enabling the Sentry Service for Hue

Sentry privileges determine which Hive and Impala databases and tables a user can see or modify from Hue. A user who logs in to Hue must have an equivalent OS-level user account on all hosts so that the user can be authenticated. The user's group must also match the group to which the privileges are granted.

  • Go to the Hue service.
  • Click the Configuration tab.
  • Select Scope -> Hue (Service-Wide).
  • Select Category -> Main.
  • Locate the Sentry Service property and select Sentry.
  • Click Save Changes to commit the changes.
  • Restart Hue.

Add Sentry Admin Group

We can designate an admin group whose members can create roles and grant the corresponding privileges.

  • Go to the Sentry service.
  • Click the Configuration tab.
  • Locate the “Admin Groups” property and add the group (e.g., sentryadmin) whose users will act as Sentry admins.
  • Click Save Changes and deploy the client configuration.

Once you are done with the configurations, you can create the roles and privileges for the users.

Creating a user in sentryadmin group

Here we will add the itversity user to the sentryadmin group so that this user can create roles.

https://gist.github.com/dgadiraju/8e3d8d4968cd927a3fcfc682f7bc3b07
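
The gist above has the exact commands used. A minimal sketch of the same idea, to be run on each node, might look like the following. The function name add_user_to_group and the RUN dry-run switch are hypothetical; by default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Dry run by default: commands are printed, not executed.
# Set RUN="" in the environment to actually execute them.
RUN="${RUN-echo}"

# Hypothetical helper: ensures the group exists and the user belongs to it.
# Usage: add_user_to_group <user> <group>
add_user_to_group() {
  local user="$1" group="$2"

  # -f makes groupadd succeed even if the group already exists.
  $RUN sudo groupadd -f "$group"

  if id "$user" >/dev/null 2>&1; then
    # Existing user: append the group to the user's supplementary groups.
    $RUN sudo usermod -aG "$group" "$user"
  else
    # New user: create it with the group as a supplementary group.
    $RUN sudo useradd -G "$group" "$user"
  fi
}

add_user_to_group itversity sentryadmin
```

Keeping the dry run as the default makes it safe to review the commands before applying them across all nodes.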

Creating Roles and Grant appropriate Permissions

Launch the beeline shell as a Sentry admin user:

beeline
!connect jdbc:hive2://bigdataserver-4.c.smooth-unison-219405.internal:10000/default

Enter itversity as the username along with that user's password. We are then logged in as a Sentry admin user who can create roles and grant privileges.

  • To create an admin role that can access all the databases on the Hive server:
    CREATE ROLE admin;
    GRANT ALL ON SERVER server1 TO ROLE admin;
    GRANT ROLE admin TO GROUP sentryadmin;
  • To create a developers role that can access a specific database (retail_db in this case):
    CREATE ROLE developers;
    GRANT ALL ON DATABASE retail_db TO ROLE developers;
    GRANT ROLE developers TO GROUP developers;

Once you are done creating the roles, you can log in to the beeline shell as a user who is part of the developers group and check the access.
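
One way to review the roles and grants created above is to collect a few SHOW statements in a file and run them through beeline as the Sentry admin user. This is only a sketch: the file name verify_roles.sql is arbitrary, and the password in the commented command is a placeholder.

```shell
#!/usr/bin/env bash
# Statements to inspect the roles created above; the file name is arbitrary.
cat > verify_roles.sql <<'SQL'
SHOW ROLES;
SHOW GRANT ROLE admin;
SHOW GRANT ROLE developers;
SHOW ROLE GRANT GROUP developers;
SQL

# Print the statements for review.
cat verify_roles.sql

# Run as a Sentry admin such as itversity (replace <password>):
#   beeline -u "jdbc:hive2://bigdataserver-4.c.smooth-unison-219405.internal:10000/default" \
#     -n itversity -p '<password>' -f verify_roles.sql
```

A user in the developers group can run SHOW CURRENT ROLES; in the same way to confirm that the developers role is active for the session.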
