
The most dangerous sentence in modern analytics is not "the business has too much data." It is "everyone needs access."
That idea sounds collaborative, even progressive. In practice, it is often how organisations turn a useful analytics platform into a quiet governance failure. NIST defines least privilege as restricting users and processes to the minimum authorisations and resources needed to perform their function. Its zero-trust guidance makes the same point in broader form, arguing that access decisions should be accurate, least privilege, and made as though the network is already compromised. Proper identity and access management is critical to securing cloud resources, and access control policies should be carefully configured so users receive only the least privilege necessary.
That matters because a data lake is not risky simply because it holds a great deal of information. It becomes risky when access design lags behind platform ambition. The aggregation of critical data makes cloud services attractive targets for adversaries. In other words, the lake itself is not the story. The trust model around it is.
The real failure is rarely storage
This distinction is important because many analytics estates are still run with a mindset inherited from file shares and shared drives. Teams create broad access groups because it is operationally convenient. Engineers grant wide permissions because deadlines are real. Business users are told to work inside a common zone because it speeds up adoption. For a while, this feels efficient. Then the platform expands. Finance wants customer-level granularity. Operations wants plant data. Trading wants market and position signals. Sustainability teams want emissions views. External partners want extracts. Suddenly the lake is serving half the enterprise, but the permissions model still behaves as though it is a team folder with a better user interface.
That is where the trouble begins. The technology scales faster than the trust design.
Why shared folders and hope break at scale
Shared access works only while the business is small enough for trust to be social rather than architectural. Once the platform becomes important, informal trust stops being sufficient.
The leading cloud data platforms have already moved beyond this. AWS Lake Formation is built around central governance, with fine-grained access controls that can restrict access at the database, table, column, row, and even cell level, backed by audit history across services. Databricks makes a similar shift with Unity Catalog, where access is layered through workspace restrictions, explicit privileges and ownership, attribute-based policies, row filters, and column masking. The significance of this is not vendor marketing. It is the market admitting that broad shared access does not survive real scale. Modern analytics platforms increasingly need access to be designed as a first-class product capability.
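To make "fine-grained" concrete, here is a minimal sketch of a column-level grant in Lake Formation using boto3. The account, role, database, table, and column names are placeholders, not a reference implementation; the exact resource shape should be checked against your own setup.

```python
import boto3

# A minimal sketch of a column-level grant in AWS Lake Formation.
# The role ARN, database, table, and column names are illustrative placeholders.
lakeformation = boto3.client("lakeformation")

lakeformation.grant_permissions(
    Principal={
        "DataLakePrincipalIdentifier": "arn:aws:iam::111122223333:role/finance-analysts"
    },
    Resource={
        "TableWithColumns": {
            "DatabaseName": "sales",
            "Name": "orders",
            # The principal can read only these columns; everything else stays out of reach.
            "ColumnNames": ["order_id", "order_date", "net_amount"],
        }
    },
    Permissions=["SELECT"],
)
```

The point is not the specific API call but the shape of the decision: access is granted to a named principal, on a named object, down to the columns that audience actually needs.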
What product-grade access design actually looks like
Product-grade access design starts with the idea that access is part of the user experience, not an afterthought for the security team. If a data product is meant for operations managers, field engineers, finance partners, and external contractors, then each of those audiences should encounter a deliberately shaped version of the product. They should not all land on the same raw surface and rely on restraint.
The first requirement is explicit ownership. Every securable object should have an owner, and access is allowed only when the relevant privileges have been granted. That sounds basic, but it changes behaviour. A platform with named owners forces someone to be accountable for who gets access and why. A platform without clear ownership drifts into inherited permissions and quiet overexposure.
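In Unity Catalog terms, that accountability can be made literal. The sketch below assumes a Databricks notebook where `spark` is available; the catalog, schema, table, and group names are placeholders.

```python
# A minimal sketch of explicit ownership and an explicit grant in Unity Catalog.
# Assumes a Databricks notebook with `spark` in scope; names are illustrative placeholders.

# Give the table a named, accountable owner (a group, so ownership survives staff changes).
spark.sql("ALTER TABLE finance.billing.invoices OWNER TO `finance-data-owners`")

# Access exists only because someone granted it, to a specific audience.
spark.sql("GRANT SELECT ON TABLE finance.billing.invoices TO `ops-analysts`")

# (The group also needs USE CATALOG and USE SCHEMA on the parents to reach the table;
# anything not granted stays invisible, with no implicit inheritance from a shared folder.)
```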
The second requirement is policy at the data level, not only at the folder or environment level. AWS Lake Formation's model of row-, column-, and cell-level control, and Databricks' use of tag-based policies, row filters, and masks, point in the same direction. The future of lake governance is not coarse access to broad zones. It is context-aware access that follows the sensitivity and purpose of the data itself. That is especially important in sectors like energy and industrials, where commercial, operational, maintenance, and customer information increasingly sits in the same analytical estate.
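As one hedged illustration of what data-level policy looks like in the Databricks flavour, the sketch below defines a row filter that limits a regional team to its own rows and a mask that hides customer emails from everyone outside a named group. Function, table, column, and group names are placeholders; Lake Formation expresses a similar idea through data cell filters.

```python
# A minimal sketch of row- and column-level policy in Unity Catalog (Databricks SQL via spark.sql).
# Function, table, column, and group names are illustrative placeholders.

# Row filter: EMEA analysts see only EMEA rows; other users see nothing from this table.
spark.sql("""
  CREATE OR REPLACE FUNCTION gov.policies.emea_only(region STRING)
  RETURN is_account_group_member('emea-analysts') AND region = 'EMEA'
""")
spark.sql("ALTER TABLE sales.core.orders SET ROW FILTER gov.policies.emea_only ON (region)")

# Column mask: emails are readable only by the support group, redacted for everyone else.
spark.sql("""
  CREATE OR REPLACE FUNCTION gov.policies.mask_email(email STRING)
  RETURN CASE WHEN is_account_group_member('customer-support') THEN email ELSE 'REDACTED' END
""")
spark.sql("ALTER TABLE sales.core.customers ALTER COLUMN email SET MASK gov.policies.mask_email")
```

The policy travels with the table, so every dashboard, notebook, or job that reads it inherits the same view of the data.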
The third requirement is environment separation that actually means something. Databricks documents workspace restrictions that can isolate production data to production workspaces, even where a user may hold wider privileges elsewhere. This is an important lesson. In too many organisations, development, experimentation, and production are separated in slides but blurred in practice. Product-grade access design makes the boundary enforceable.
The fourth requirement is auditability that supports management, not just forensics. AWS provides comprehensive audit logs through CloudTrail for data access attempts across services. This is not just about catching intruders. It is about allowing a platform owner to answer a basic leadership question with confidence: who accessed what, when, through which service, and under which policy.
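A hedged sketch of what answering that question can look like: pulling a week of Lake Formation activity out of CloudTrail with boto3. This covers permission and governance events recorded under the Lake Formation event source; reads issued through query engines such as Athena appear under those services' own event sources.

```python
from datetime import datetime, timedelta, timezone

import boto3

# A minimal sketch: list the last week of Lake Formation activity recorded in CloudTrail.
cloudtrail = boto3.client("cloudtrail")
now = datetime.now(timezone.utc)

response = cloudtrail.lookup_events(
    LookupAttributes=[
        {"AttributeKey": "EventSource", "AttributeValue": "lakeformation.amazonaws.com"}
    ],
    StartTime=now - timedelta(days=7),
    EndTime=now,
)

for event in response["Events"]:
    # Who did what, and when; the full request context sits in the raw CloudTrail record.
    print(event["EventTime"], event["EventName"], event.get("Username", "unknown"))
```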
Why this matters more in the age of AI and self-service
The old permissions model was already weak. AI and self-service analytics make it weaker.
Every new agent, notebook, model training job, dashboard layer, and external share increases the number of identities acting on data. NIST’s definition of least privilege explicitly applies not only to users but also to processes acting on their behalf. That is a useful reminder, because many organisations are still good at reviewing human access and poor at governing service accounts, pipelines, automated jobs, and data science workflows with the same discipline.
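One practical way to apply that discipline is to treat each pipeline identity like a narrowly scoped product user. A minimal sketch, again in Unity Catalog terms, granting an ingestion job's service principal only the read and write paths it needs; the catalog, schema, and table names are placeholders, and the principal is referenced by a placeholder application ID.

```python
# A minimal sketch: a pipeline's service principal gets exactly the access its job requires.
# Catalog, schema, and table names are placeholders; the UUID stands in for the
# service principal's application ID.
pipeline = "`11111111-2222-3333-4444-555555555555`"

# Just enough to reach the two schemas involved.
spark.sql(f"GRANT USE CATALOG ON CATALOG sales TO {pipeline}")
spark.sql(f"GRANT USE SCHEMA ON SCHEMA sales.raw TO {pipeline}")
spark.sql(f"GRANT USE SCHEMA ON SCHEMA sales.core TO {pipeline}")

# Read from the landing table, write to the curated table, and nothing else.
spark.sql(f"GRANT SELECT ON TABLE sales.raw.orders_landing TO {pipeline}")
spark.sql(f"GRANT SELECT, MODIFY ON TABLE sales.core.orders TO {pipeline}")
```

Reviewing a pipeline then becomes a question of whether its grant list matches its job description, which is a far easier conversation than untangling what a shared group can see.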
This is where the phrase product-grade becomes especially useful. Product teams know that scale does not come from more manual approvals. It comes from designing good defaults, clear roles, bounded entitlements, observable behaviour, and predictable escalation paths. Analytics platforms need the same thinking. A mature platform should make the secure path the easy path. If getting the right access is slower than getting broad access, the broad access will win every time.
The mistake leaders keep making
Too many executives still treat access control as a technical hygiene issue. It is not. It is one of the main determinants of whether a data platform can become a trusted enterprise product.
If permissions are too loose, the organisation eventually suffers a data exposure, a partner trust issue, an internal credibility problem, or a regulator’s question it cannot answer cleanly. If permissions are too rigid and badly designed, the platform becomes a bottleneck and the business routes around it. The winning position sits in the middle. Tight enough to be credible, usable enough to support real work.
That is why this is not mainly a storage conversation. It is a product and operating model conversation. The leading platforms have already evolved toward central governance, fine-grained controls, attribute-based access, audit trails, and explicit ownership because the old approach does not survive enterprise scale. The organisations that still rely on shared folders with better branding are not simplifying access. They are postponing a more serious design decision.
