Could somebody please help me with this query :).

We use Impala to query data, with Sentry to restrict access to data at column level.

We use Spark to write code that queries data stored in files. My understanding is that Sentry roles cannot control access at column level when used with Spark. However, it has been suggested that Spark can be used together with Impala, so that code accesses data via Spark while Sentry roles still control access at column level. Is this correct? I can't find any information on this anywhere.
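
As a rough illustration of what such a setup might look like, here is an untested sketch in which Spark reads through Impala's JDBC endpoint instead of reading the files directly. The function names, the host, database, table and column names, the driver class and the port are all assumptions for illustration and would need checking against the Impala JDBC driver in use. Authentication settings are omitted. Whether Sentry column-level rules are applied this way is exactly what is being asked; the sketch only shows the shape of the idea.

```python
def impala_jdbc_options(host, database, table, columns, port=21050):
    """Build options for Spark's generic JDBC data source, pointed at Impala.

    All values here are illustrative assumptions, not a verified configuration.
    """
    # Name only the columns needed, so the statement sent to Impala
    # does not reference any other column of the table.
    column_list = ", ".join(columns)
    return {
        # Assumed URL scheme and port; check the driver documentation.
        "url": f"jdbc:impala://{host}:{port}/{database}",
        # Assumed driver class; the name differs between driver versions.
        "driver": "com.cloudera.impala.jdbc41.Driver",
        # A parenthesised subquery with an alias stands in for a table name.
        "dbtable": f"(SELECT {column_list} FROM {table}) AS t",
    }


def read_via_impala(spark, host, database, table, columns):
    """Load the selected columns as a DataFrame by querying Impala over JDBC."""
    opts = impala_jdbc_options(host, database, table, columns)
    # The statement is executed by Impala rather than by Spark reading files.
    return spark.read.format("jdbc").options(**opts).load()
```

With an existing `SparkSession` named `spark`, this would be called as `read_via_impala(spark, "impala-host", "sales", "customers", ["id", "region"])`, where every argument is a placeholder.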

What I have tried:

This is a theoretical question at the moment. I have tried searching for information but can't find anything.
Posted
Updated 5-May-19 23:45pm

1 solution

Impala and Spark are two separate SQL engines for use with Hadoop. One cannot use features from the other.
So, no: if you use Impala there is no Spark, and if you use Spark there is no Impala.
 
Comments
Jackie Lloyd 7-May-19 3:54am    
Thank you very much for replying. Do you know if there is any way to control data access at column or row level when accessing data via Spark?
Kornfeld Eliyahu Peter 7-May-19 3:56am    
I have never done it, but it is easy to find articles that seem to fit, at least on the surface.
Try a Google search, for example:
https://hortonworks.com/blog/row-column-level-control-apache-spark/
Jackie Lloyd 7-May-19 5:12am    
That's great, thank you. I had looked for an article like this but hadn't found one. I must have been searching with the wrong keywords, so I assumed that column-level access control was not possible with Spark.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


