Controlling Access on Large Drupal Sites
July 27th, 2006
Suppose you’ve got a large website with hundreds of pages of content… maybe even thousands. Suppose also that you have several different user groups and several sections of content requiring varying levels of access. Site content is written and edited by non-techies, and we don’t want them to have to worry about protecting each page they create… we just want them to publish and not have to think about access control.
This is a pretty tall order… but thanks to Drupal and some great third-party modules, it can be a piece of cake.
First things first: this is a Drupal tutorial… not necessarily applicable to other CMSs. There are other great CMSs out there, but Drupal is what I use for building large-scale multi-user/community websites. That being said, the same principles could likely be applied in other setups.
You will need Drupal 4.7.x, the core “path” module, and two third-party modules: Path Access and Path Auto.
And of course, for these things to work, you will need an appropriate hosting environment such as Apache, PHP and MySQL. Drupal doesn’t seem too fussy about versions… but I haven’t had good luck running Drupal on IIS.
As you can tell, this is straightforward path access control, very simple… you only really need the two third-party modules.
The reason I thought it was even worth writing about at all is this… the Path Auto module can destroy your site if you don’t know what you’re doing. Fortunately I already learned the hard way… so you don’t have to. If you are building a new site with Drupal and you haven’t set up any path aliases yet, you can safely ignore this section. But if you have taken the time to carefully construct some nice URL aliases for your Drupal pages, OR if you have intentionally NOT created aliases for some of your content… YOU MUST READ THIS!
Path Auto is like Find and Replace, only Deadlier
Anyone who has worked extensively with ‘find and replace’ features has experienced this: the tragic “OMG! I didn’t mean to find and replace that!!!!” But it’s too late… you did find and replace that, whether you realized what you were doing or not, and now it is not a matter of one undo… but possibly hundreds of Ctrl-Z’s… and you might not have that many! WELL… the Path Auto module for Drupal can be a lot worse! One word of warning that applies to pretty much anything you do with a Drupal installation… BACKUP FIRST!!! Not just your database, but your files as well.
Simple Yet Powerful Access Control
OK, warnings out of the way, down to business. Get Drupal set up and running smoothly if you haven’t done so already (if you’re never going to use Drupal then you probably don’t need to read this).
Download the Path Access and Path Auto modules for Drupal 4.7.x. Extract the folder for each module to the modules directory of Drupal (and then upload it if you haven’t done this directly on your server). Navigate your way to “?q=admin/modules” and turn on the new modules you’ve uploaded, as well as the Drupal core module “path”, which is required for these others to work.
Now, there are a lot of ways to set up access control on a Drupal site, but I personally think this is the easiest method, especially if you are having non-techie people contribute content to the site. The Path Access module gives us an easy way to control access to a given path… and, by using the ‘*’ wildcard symbol, to anything that might be appended to that path. So say we want to make our forum content viewable to members only. We can control access to the forum root by blocking anonymous users from accessing anything at ‘forum’… and we can cover anything that could get appended to that, such as ‘/my-nice-day’, by using ‘forum*’, so that ‘forum/my-nice-day’ would be protected as well. The problem is that Drupal doesn’t create URL aliases for content automatically; by default you have to fill them in yourself. There is no way we are going to get everyone in the forum to create URLs for their entries with ‘forum/’ in front… we don’t want them to have to think about URLs at all… just the content.
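To make the wildcard idea concrete, here is a rough sketch in Python (not the Path Access module’s actual code, which is PHP). The function names path_matches and can_view are made up for illustration, and I am assuming that ‘*’ simply means “any run of characters”.

```python
import re


def path_matches(pattern, path):
    """Return True if `path` matches `pattern`, where '*' stands for any run of characters."""
    # Escape the pattern so it is treated literally, then turn the escaped '*' into '.*'.
    regex = re.escape(pattern).replace(r"\*", ".*")
    return re.fullmatch(regex, path) is not None


def can_view(path, is_anonymous, blocked_patterns):
    """Hypothetical check: anonymous users are refused any path on the blocked list."""
    if not is_anonymous:
        return True
    return not any(path_matches(p, path) for p in blocked_patterns)


blocked = ["forum*"]
print(can_view("forum", True, blocked))               # False
print(can_view("forum/my-nice-day", True, blocked))   # False
print(can_view("about", True, blocked))               # True
print(can_view("forum/my-nice-day", False, blocked))  # True
```

Note that in this sketch ‘forum*’ is a plain prefix match, so it would also catch a path like ‘forum-rules’. If that matters on your site, check how the module itself treats the pattern.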
Path Auto to the Rescue
Thankfully some brilliant developer out there realized that this would be an issue… both for the sake of nice meaningful URLs and also for the sake of controlling access based on URL paths. The thing is, Path Auto can do some pretty unexpected things if you’re not sure what you’re doing. To administer the settings for Path Auto, you will need to go to “?q=admin/settings/pathauto”. Here you can set up automatic alias generation for pretty much anything in your Drupal site, including custom content types if you have created any. The first section, ‘General Settings’, is where you can define what you would like to use as a separating character; the default is ‘_’, but I prefer ‘-’. Do NOT enable ‘Create Index Aliases’ unless you are absolutely sure that is what you want to do. If you have already set up a site structure with various sections and links based on aliases you have created, you probably do not want to enable this.
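If you are wondering what the separating character actually does, this little Python sketch shows the general idea of turning a title into an alias. It is only an illustration: title_to_alias is a made-up name, and the exact cleanup rules Path Auto applies to a title are an assumption on my part.

```python
import re


def title_to_alias(title, separator="-"):
    """Lowercase the title and join its words with the chosen separator."""
    # Collapse every run of characters that is not a-z or 0-9 into one separator.
    slug = re.sub(r"[^a-z0-9]+", lambda _match: separator, title.lower())
    # Drop any separator left dangling at either end (e.g. from a trailing '!').
    return slug.strip(separator)


print(title_to_alias("My Nice Day!"))       # my-nice-day
print(title_to_alias("My Nice Day!", "_"))  # my_nice_day
```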
Under ‘Node Path Settings’, Path Auto lets you set up a URL structure for each content type you have, plus a generic URL alias structure for any content types you don’t specify. At the bottom of these settings you will also notice an option to ‘bulk update node paths’, which generates aliases for every node that doesn’t already have one. BE CAREFUL HERE! If you have explicitly NOT set URL path aliases for some of your content, such as images, you will be creating aliases for them now, even if you haven’t defined a pattern for them, since the default is simply the title. If you want to exclude certain content types, you should remove the default that is applied to any blank patterns.
If you generate a whole pile of aliases for a content type you didn’t mean to, it’s not the end of the world, but it can be a problem because those URL aliases are now taken, and you will need to delete them to get them back. This is where Path Auto can be worse than Find and Replace, because there is NO UNDO! You cannot batch process URL aliases in Drupal… that I know of… so the fastest way to delete them is to go to manage content, filter the view so you are looking at just the problem content type, and then manually remove the URL alias from each of the nodes accidentally affected by Path Auto. Before you do that though, make sure you’ve updated the Path Auto settings so that the default alias structure is blank.
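Here is a rough Python sketch of why the default pattern matters during a bulk update. It is a simplified model and not Path Auto’s real code: the ‘[title]’ placeholder, the bulk_update function, the data layout, and the choice to skip an alias that is already taken are all assumptions made for illustration.

```python
import re


def slugify(title, separator="-"):
    """Lowercase the title and join its words with the separator."""
    return re.sub(r"[^a-z0-9]+", lambda _m: separator, title.lower()).strip(separator)


def bulk_update(nodes, patterns, default_pattern, aliases):
    """Create aliases for nodes that have none; `aliases` maps alias -> internal path."""
    already_aliased = set(aliases.values())
    created = []
    for node in nodes:
        internal = f"node/{node['nid']}"
        if internal in already_aliased:
            continue  # this node already has an alias, so leave it alone
        # Fall back to the default pattern when the content type has none of its own.
        pattern = patterns.get(node["type"]) or default_pattern
        if not pattern:
            continue  # blank pattern AND blank default: the node is skipped
        alias = pattern.replace("[title]", slugify(node["title"]))
        if alias in aliases:
            continue  # alias is already used up (assumed behaviour for this sketch)
        aliases[alias] = internal
        already_aliased.add(internal)
        created.append(alias)
    return created


nodes = [
    {"nid": 87, "type": "forum", "title": "My Nice Day"},
    {"nid": 88, "type": "image", "title": "Sunset"},
    {"nid": 89, "type": "page", "title": "About Us"},
]
patterns = {"forum": "forum/[title]"}

# With a default of '[title]', the image node gets an alias we never asked for.
print(bulk_update(nodes, patterns, "[title]", {"about": "node/89"}))
# ['forum/my-nice-day', 'sunset']

# With a blank default, only the forum node is touched.
print(bulk_update(nodes, patterns, "", {"about": "node/89"}))
# ['forum/my-nice-day']
```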
Protecting your Path
So now you’ve gone through all this trouble and you’ve set up your Drupal site so that every time anyone creates new content (say it’s forum posts), the post node gets an alias like ‘forum/my-nice-day’, rather than the oh so informative ‘node/87’. The benefit of this is that now we can automatically control access to our entire forum simply by going to ‘admin/access’ > ‘URLS’ and adding ‘forum*’ to the list of paths that the ‘anonymous’ user cannot access. Voilà! Nobody needs to worry about forum intruders, and nobody needs to worry about creating URLs. Sweet.