This paper was originally published in the
Proceedings for First World Conference on System
Administration, Networking and Security in Washington DC,
Policies and Procedures.
|Types of Security Incidents|
|Insider threats||70% to 80%|
|Physical threats||15% to 30%|
|Outsider threats||1% to 5%|
In addition, most inside incidents are mistakes:
|Types of Inside Security Incidents|
|Human error, accidents||50% to 60%|
|All others||20% to 30%|
Many companies have an open personnel policy and almost regard internal security as an evil thing. However, if reliable computing resources are necessary, internal security must be addressed. As the statistics show, most incidents are stupid mistakes, often made by well-meaning users who lack the understanding needed to make good decisions.
RFC 1244 gives some very good guidelines for implementing a security policy. The examples given here are therefore kept brief. Some examples of the content of a security policy would be:
Who has root privileges always seems to be a controversial question. It can at times be very difficult to convince managers and users alike that it is in the best interest of the site for as few people as possible to know this password. Here are a few inventive methods that can help to counter some requests for the root password (unfortunately, I no longer remember who contributed these suggestions):
I have actually used the last suggestion successfully at a large company that had security guards on duty 24 hours a day. The intimidation factor worked so well that the envelope was very seldom used.
Many UNIX people assume a disaster recovery plan is something for very large installations. Such disaster recovery plans include standby computer centers, or other similarly expensive solutions, just waiting for the disaster to hit. However, even very small organizations will do well by having a disaster recovery plan. Also, a disaster is often associated with large events, like an earthquake or a fire destroying the plant. However, much smaller events can prove to be a disaster if they are not planned for in advance. Consider for example:
Good planning ahead of time can save much time when the planned-for event occurs, because the system administrator will know specifically what to do, and the necessary resources will be available. For example, replacing a failed disk can be a very time-consuming procedure. However, if a formatted spare drive is available on site, the affected system can be back online in a significantly shorter time. If the drive first needed to be formatted, this would add time to the replacement work. The tradeoff here is the investment in a spare drive versus the time to recover after a failure. The answers to such questions will be different for each organization, as they depend on available capital, cost of downtime, time pressure on projects, etc.
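The spare-drive tradeoff can be made concrete with a little arithmetic. The following is a minimal sketch in Python. Every figure in it (drive price, cost of downtime per hour, recovery hours with and without a spare, and the expected number of failures) is a hypothetical placeholder that each organization would replace with its own estimates.

```python
def spare_drive_pays_off(spare_cost, downtime_cost_per_hour,
                         hours_without_spare, hours_with_spare,
                         expected_failures):
    """Return (savings, worthwhile) for keeping a formatted spare on site.

    All inputs are site-specific estimates supplied by the caller.
    """
    # Downtime hours avoided over the planning period.
    hours_saved = (hours_without_spare - hours_with_spare) * expected_failures
    # Money saved on downtime, minus the investment in the spare drive.
    savings = hours_saved * downtime_cost_per_hour - spare_cost
    return savings, savings > 0


# Hypothetical figures, for illustration only.
savings, worthwhile = spare_drive_pays_off(
    spare_cost=2000,
    downtime_cost_per_hour=500,
    hours_without_spare=12,
    hours_with_spare=4,
    expected_failures=1,
)
print(savings, worthwhile)  # prints "2000 True"
```

With a low cost of downtime or a very expensive drive the result turns negative, which is the point made above: the answer differs from site to site.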
Procedures are as important to document as policies, although they serve a very different purpose. A procedure gives step-by-step instructions for how a specific task must be executed. Well-documented procedures mean more work can be done by less skilled people. One example of this is that in many installations, workstations are installed and configured by system administrators. However, with a good procedure, workstations can easily be installed by an operator. This would leave the system administrator free to do work better suited to that person's skill level. Similarly, a system administrator should only need to get involved with backups and restores in a planning capacity, or in case problems occur. The actual work of performing the task should be done by an operator. Many other tasks are often done by overskilled personnel because the procedure, instead of being documented, is treated almost as a black art.
In addition to making better use of people and providing better work satisfaction, a well-documented procedure lends itself well to automation. Many attempts I have seen to automate simple procedures, such as adding and deleting accounts, have failed because the procedure was not well understood before the automation effort started.
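One way to see why a documented procedure is a good starting point for automation is to write the procedure down as an ordered list of steps first and attach actions to the steps later. The sketch below assumes a hypothetical account-creation procedure. The step wording, the `no_op` placeholder, and the login name are illustrative only and do not describe any particular site's method.

```python
def run_procedure(steps, context, dry_run=True):
    """Walk through a documented procedure one step at a time.

    steps   -- ordered list of (description, action) pairs
    context -- dict of values the steps need, e.g. the login name
    dry_run -- when True, only report what would be done
    Returns the numbered step descriptions, in order.
    """
    log = []
    for number, (description, action) in enumerate(steps, start=1):
        if not dry_run:
            action(context)
        log.append(f"{number}. {description.format(**context)}")
    return log


def no_op(context):
    """Placeholder action; a real site would put its own command here."""


# Hypothetical account-creation procedure; the steps are illustrative only.
ADD_ACCOUNT = [
    ("Check that login name '{login}' is not already in use", no_op),
    ("Assign a user ID and group to '{login}'", no_op),
    ("Create the home directory for '{login}'", no_op),
    ("Set an initial password and tell the user how to change it", no_op),
]

for line in run_procedure(ADD_ACCOUNT, {"login": "jdoe"}):
    print(line)
```

In dry-run mode the output is simply the checklist an operator would follow by hand. Only once that checklist is agreed upon does it make sense to replace the placeholders with real actions.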
It is also important to ensure that management is educated about the purpose and expected return of these tasks. I know of at least one company where good procedures were implemented, followed by a decision from upper management that the system administrator was no longer needed. Upper management's assumption proved to be right for the first six months. At that point the reliability of the systems started to deteriorate very fast. The cost of repairing the damage created by this shortsighted decision was significantly greater than the cost of keeping the system administrator during those six months would have been.
As can be seen, the cornerstone of the process of introducing good policies and procedures is communication. During the process, many requirements for technical changes are sure to arise. It is extremely important to get cooperation from users by letting them know ahead of time of any upcoming changes. A very common and very reasonable complaint I have heard from the user community is that changes take place without any advance notice. A system administrator does not always have the option of giving advance notice. However, this should be possible in the vast majority of cases. Nothing can be more frustrating for a user than coming in on the weekend to catch up on a late project, only to find the main file server down for upgrades.
Such notification can be done through e-mail, a small newsletter, /etc/motd, or message boards outside the computer room(s). However, an increasing number of companies are starting to have user committees, which serve as the main point of contact with the user community. Such a committee can be used to clear plans for major changes and to ensure that the timing of such changes has the smallest possible impact on the users' work schedules. A small newsletter is extremely useful for broadcasting such decisions to the user community at large.
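Whatever channel is used, the notice itself should answer the same few questions every time: which system, when, why, and whom to ask. The following Python sketch builds such a notice as plain text suitable for an e-mail or for /etc/motd. The field layout, the function name `downtime_notice`, and the sample values are assumptions made for illustration.

```python
from datetime import datetime


def downtime_notice(system, start, end, reason, contact):
    """Build a short plain-text notice announcing planned downtime."""
    if end <= start:
        raise ValueError("downtime must end after it starts")
    fmt = "%Y-%m-%d %H:%M"
    return (
        f"*** SCHEDULED DOWNTIME: {system} ***\n"
        f"From:    {start.strftime(fmt)}\n"
        f"Until:   {end.strftime(fmt)}\n"
        f"Reason:  {reason}\n"
        f"Contact: {contact}\n"
    )


# Hypothetical example values.
print(downtime_notice(
    system="main file server",
    start=datetime(2000, 1, 8, 8, 0),
    end=datetime(2000, 1, 8, 12, 0),
    reason="disk upgrade",
    contact="the system administrator",
))
```

The sketch only produces the text. How far in advance it is sent, and through which channel, remains a matter for the site's policy or its user committee.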
It is very important to establish ways for the users to communicate problems and wishes to the system administrator in an informal manner. While most requests can and should be handled through the official procedures, it has been my experience that it is also important to have an unofficial channel through which users can express themselves directly to the system administrator. One example of this was reported by Max Vasilatos from her work creating the computing center at OSF. While a large amount of work had to be done to build this installation, she found that the users were generally more satisfied if she spent a few hours every day just working the hallways and being available for small talk than when that time was used to solve the prevailing problems.
While good communication is important, it will not be possible to solve all problems. The reason could be that the resource asked for is unavailable, or that the specific user is just a pain in the neck and cannot be accommodated. Whatever the reason for rejecting a user's request, two things can help. First, explain to the user why the rejection is necessary (policy, budget, time constraints, etc.), and encourage the user to escalate the problem to his or her manager. Second, immediately let your own manager know what has occurred and your reasoning for the rejection. I have found that when this is done every time, the manager comes to trust your judgment more and will seldom overrule it without good reason.
The same procedure should be used when confronted with a problem that is not within the scope of the system administrator's responsibility. The key is to keep the manager informed of ongoing events. Managers hate surprises, especially bad ones.
Even with all these methods employed, it is not always possible to reach a satisfactory result. In such cases, it can be necessary to use CYA methods to avoid becoming a scapegoat at a later time.
In such cases, memos on special problems can be helpful if done right (that is, professional and informative). However, one of the best methods in my experience is to use weekly status reports, stating progress and upcoming plans. If a problem occurs for which it is impossible to get a solution, place it at the bottom of the status report and leave it there every week until it is resolved.
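The status-report method is simple enough to sketch in a few lines of Python. The report layout, the function name `status_report`, and the sample entries below are assumptions for illustration. The one essential feature is that an unresolved problem stays at the bottom of every report, together with how long it has been there.

```python
def status_report(week, progress, plans, open_issues):
    """Format a weekly status report as plain text.

    open_issues is a list of (description, first_reported_week) pairs;
    each one stays at the bottom of the report until it is resolved.
    """
    lines = [f"Status report, week {week}", "", "Progress:"]
    lines += [f"  - {item}" for item in progress]
    lines += ["", "Upcoming plans:"]
    lines += [f"  - {item}" for item in plans]
    lines += ["", "Unresolved problems:"]
    for description, first_week in open_issues:
        weeks_open = week - first_week + 1
        lines.append(
            f"  - {description} (reported {weeks_open} week(s) running)"
        )
    return "\n".join(lines)


# Hypothetical entries, for illustration only.
print(status_report(
    week=5,
    progress=["Installed two new workstations"],
    plans=["Upgrade the main file server"],
    open_issues=[("Backup scheme is insufficient", 1)],
))
```

Carrying the week count along makes the paper trail explicit: anyone reading the latest report can see how long the problem has been known.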
One example of where this strategy paid off was at a small startup company that had 25 XENIX machines and only two 40 Mbyte cartridge tape drives. With this equipment, it was impossible to perform a reasonable backup, and I requested a 9-track tape drive (which was the high-capacity, high-speed equipment of the time). Upper management decided that the $10,000 necessary was too expensive, and that I would have to make do with the existing equipment. Knowing that the current backup scheme was extremely insufficient, I started to state on my status report every week that the backup scheme was insufficient and needed resolution. This continued for almost four months, in spite of many remarks from my manager that he already knew about the backup problem and that I did not need to tell him every week.
However, after four months, the inevitable happened. One of the machines had a disk crash, and Murphy's law went into full effect immediately. First, the disk contained the software for a major release, announced to take place three days later; second, the machine was scheduled for backup that night, making the last backup two weeks old; and on top of that, the last backup was impossible to restore, requiring the previous backup to be restored instead (this one was four weeks old). After the dust had settled, I could prove that I had warned against such an event on my status report for the past four months. If this had not been the case, I would probably have lost my job. Instead, because of the CYA effect, I got $10,000 to purchase a new tape drive.