DUKE ITAC - September 29, 2005 Minutes
September 29, 2005
Members present: Tracy Futhey, John Board, Mike Pickett, George Turner, Dalene Stangl, Owen Astrachan represented by Jeff Forbes, Caroline Nisbet, Paul Harrod represented by Alfred Trozzo, Dick Danner represented by Wayne Miller, Molly Tamarkin, Michael Gettes, Shailesh Chandrasekharan, Billy Herndon, Rafael Rodriguez, Ed Gomes, Lynne O'Brien represented by Jim Coble, Roger Loyd, George Oberlander, Pakis Bessias, Bob Newlin, Shiva Das
Guests: Kevin Miller, OIT; Chris Cramer, OIT; Rob Adams, MCIS; Ginny Cake, OIT
Start time: 4:05 p.m.
I. Review of Minutes and Announcements:
Tracy Futhey says we've hired a new person in OIT. His name is Tim Poe.
Michael Gettes says he's from UNC and has worked with audio and video services nationally. Many people look to UNC for a model of audio/video services. Tim starts on Monday.
Tracy says some of you may know this position has been coming for a long time after an ITAC subcommittee determined we should do this.
II. What processes does Duke need for improving backups? - John Board, Mike Pickett
John Board says we have had a couple of high-profile events where backup systems didn't work properly, and it has awakened a fear that other backup issues are floating around campus. The steering committee has been wondering how ITAC can be most helpful in addressing this. It's clear there is no one-size-fits-all solution. It is also clear that this is an awkward time because of the state of technology: disks are cheap and big, tapes are expensive and slow. We have come up with a couple of ideas of what ITAC can do.
Mike Pickett says we can use the Futures Forum. For those who aren't familiar with it, for the last 14 years the Forum has held half-day meetings on a particular technology in which a significant change has occurred and to which we see a need to respond as an institution. The intent is to identify whether there is something we should pilot this year. What we typically try to do (and this is only Duke people) is bring in folks around Duke who have tried new things, seen them work, or seen things at other institutions and feel it's something we, as an institution, should adopt.
Some questions to be posed to folks who probably have something to say on these topics: What are things that could go wrong? What tools and techniques can work to bring us back? How do we back up user files? What should be the difference between that and server files? Should we encrypt things that we back up? How do I know I'm really protected? How often do I check that out? What about offsite storage? How often do I back up and archive? How often should archives be shredded?
If there are people you think should be involved, send those suggestions to me. We're aiming for early November for the Futures Forum. We're trying to find a big enough place – we usually go to Searle Center where we can get enough people in the room to have a good discussion. After that, it might be timely to convene a short-term task subcommittee to think about best practices in late 2005 or early 2006.
Molly Tamarkin says I think the first question you need to ask is why are you backing up? Are you backing up to help bring back a system when it crashes, or are you doing it for a user who wants their data back if it is lost? We need to recognize that the way to build backups may differ according to the answer to those questions. Also, customers may have a very different sense of what a good backup is and how quickly it can be restored. What strikes me is how quickly people expect to restore something that was deleted within two hours of creation. [At the Nicholas School] we have an archive policy and our own way of storing data offsite. It may be useful to hear what others are doing, but I feel that my boss and advisory committee need to weigh in on these kinds of decisions.
George Oberlander says many systems have gone through intense and undocumented changes to system settings, and the backup system doesn't back up the settings. Sometimes it is critical to be able to restore the system settings and not just the data. Failure to do so may require very expensive and time-consuming reinstallations. It's much harder to be certain that you're getting good backups of system settings than of data files. Also, with our system, one vendor checked the backup system and said everything was fine; we then went to another vendor who looked at it and found all sorts of problems.
John Board asks is there a sense that there even exists a set of best practices in backup?
Michael Gettes says not necessarily best practices, but good practices.
Molly Tamarkin says getting into the realm of contingency planning, what you need will vary, but the questions are the same. We have ways of mitigating risk that seem acceptable to us but not to the hospital.
Billy Herndon says for basic enterprise systems they have duplicate servers where they can fail over. But it's a little different when you think of Hurricane Katrina and those kinds of things: if you're out of business with a primary and failover server, you have to have another site. There are different degrees for different purposes.
Rafael Rodriguez says if we look at what Molly is talking about, we need to look at it from a business perspective. I've found the biggest problems are around processes. All the way around, if we have a major disaster, how are we going to register people? The whole thing about business recovery is a lot broader than just the technology end of it.
John Board asks is it better to focus just on the technical end, or are we ignoring that it is part of a bigger picture?
Rafael Rodriguez says it's two tracks.
Billy Herndon says we probably need to talk about backups, but there is a bigger thing about business continuity.
Michael Gettes says this topic is like most other things in IT: you have the functional and the technical requirements, and one keeps influencing the other until you get it right. The other thing to keep in mind is that it's not just what our processes are, but many of us have contracts with outside vendors. What are our contracts with them? What rights do we have to validate their processes and not just take their word that things are okay?
Ginny Cake says I think if you expand to business continuity you need to extend the discussion not just to IT personnel but to administrative people. There is another group to educate.
III. Paging Update - Rafael Rodriguez, Ginny Cake, Billy Herndon
Rafael Rodriguez says we went from a Duke Hospital to a Duke Health System, then got involved with Durham Regional and now Duke Health Raleigh. We have basically three different aging paging infrastructures on the Duke campus. Duke University really has two infrastructures: one is owned by Duke and is used for “incident” paging in special emergencies. The reason for that is that the latency we need for code pages is lower than we can get from a vendor.
The paging system is in dire need of upgrades, and two concepts came to mind. One is the need for integrated, system-wide paging. Two is investing in our own system. It became evident that for the high-response paging we need to continue running our own system. For wide area paging, which includes North and South Carolina, we are going to go to an outside firm.
The two vendors previously used at Duke were both bought up by USA Mobility, so that was very convenient. We basically made the recommendation to go with them, and it has been accepted. We negotiated with the vendor so the same instruments can be used at all locations. We are going to be switching the frequency at Duke to the frequency we own at Durham Regional because it has a wider footprint. We're going to go with an outside vendor for what is called the wide area. There will be one device and one set of frequencies. We have to evaluate capacity. In getting ready for the hurricane in Texas, our current system was overwhelmed this past Friday. Nothing was getting through, so this is to avoid issues like that.
John Board says if the current system can be overloaded by classic pages, have we done testing to see if it can get overloaded by different kinds of messaging?
Ginny Cake says the technical team was looking at that. It depends on the processor speed and everything else. We're also trying to put in place a monitoring system so if it gets to a certain threshold it will send out a notification. Last week we didn't even know there was a problem until we started getting complaints about latency of messages.
George Oberlander says I'm surprised there isn't a local vendor that can do low-latency paging.
Rafael Rodriguez says we were basically saying we need to have a page go out within three seconds. When some of these codes go out you have to take action within a minute or the mortality rate increases significantly. Also, sometimes satellites go out. And vendors are not willing to take on the liability. It's pretty standard in the U.S. to have these handled locally in-house.
John Board says a lot of our IT staff carry pagers; are they completely separate?
Rafael Rodriguez says this will be for all of Duke, not just the Health System. It will be a single system for everything.
Ginny Cake says USA Mobility will enhance the strength of signal in all buildings.
John Board says what is the time schedule for going from three pagers to one?
Ginny Cake says that it should be within the next 90 days that we roll all that out.
Bob Newlin says I carry a cell phone and a pager; is there a way to forward pages to a cell phone?
Ginny says you will have the ability to sign out a pager to your cell phone number, so when a person calls a pager it tells them to call the cell phone number. But actual forwarding is a little ways away.
Rafael says I think it is going to be the other way around: cell phones are going to replace your pager for notification. The biggest problem with cell phones and Blackberries is that the signals are different and they create interference with other devices, and in some areas we don't have a signal. In the hospital especially we rely on those signals being there all the time.
IV. Dorm network rate limiting update - Kevin Miller
Kevin Miller says ResNet rate limiting this year was changed a little bit. Actually, the guidelines have not changed at all. To recap, we have a 5 GB/day outbound limit. When a user exceeds that limit, they get a ticket; after five tickets they are dropped down to a rate-limited group. Limits only apply to traffic going off campus. What has changed is that over the summer we made some changes to the East Campus dormitories. Previously there was one large contiguous network, and we broke it down into five zones. We made a number of changes to group the IPs more logically. Four zones break down the same way East Campus zoning does. The fifth is the Bell Tower, and we adjusted the rate limits. Last year we had total rate limits; this year we did some redistribution. We recognized that some areas are larger than others and the rate limits didn't account for that.
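[Editor's illustration, not part of the meeting record.] The ticket policy recapped above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the names `ResNetUser` and `record_day` are hypothetical, 1 GB is taken as 10^9 bytes, and a user is assumed to move to the rate-limited group on receiving the fifth ticket. The minutes do not say how ResNet actually implements the policy.

```python
from dataclasses import dataclass

# 5 GB/day outbound limit from the minutes; decimal gigabytes are an assumption.
DAILY_OUTBOUND_LIMIT_BYTES = 5 * 10**9
# Assumption: the user is moved to the rate-limited group on the fifth ticket.
TICKETS_BEFORE_RATE_LIMIT = 5


@dataclass
class ResNetUser:
    """Hypothetical record of one dorm network user."""
    name: str
    tickets: int = 0
    rate_limited: bool = False


def record_day(user: ResNetUser, off_campus_outbound_bytes: int) -> ResNetUser:
    """Apply one day's off-campus outbound total to the user's record.

    Only traffic leaving campus counts; on-campus traffic should be
    excluded before calling this function.
    """
    # A ticket is issued only when the limit is exceeded, not merely reached.
    if off_campus_outbound_bytes > DAILY_OUTBOUND_LIMIT_BYTES:
        user.tickets += 1
    if user.tickets >= TICKETS_BEFORE_RATE_LIMIT:
        user.rate_limited = True
    return user
```

For example, calling `record_day(user, 6 * 10**9)` on five separate days would leave `user.rate_limited` set to `True`, while a day at or below the limit leaves the ticket count unchanged.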
Tracy Futhey says, just to clarify, you're trying to set an aggregate amount based on how many people are in that area, so if you are in a larger group you're not at a disadvantage.
Kevin Miller reviews the status of tickets issued to date.
What we've seen is about a 20 percent increase over last year in both inbound and outbound traffic; similarly, from the summer to September we saw about a 25 percent jump. We have had peaks around 350 Mb with average utilization just below that; inbound and outbound are pretty similar. Internal to Duke, we are seeing a marked increase in utilization from the data center, and we believe a lot of this is backup traffic. Wednesday we saw a 500 Mb peak and an average running around 200. We're also seeing an increase in traffic on other routers. Some of this is backup traffic, and we're seeing peaks of a half GB on ones that serve the engineering libraries.
V. Strategic planning charge to ITAC - Tracy Futhey
Tracy says at our last meeting we distributed a copy of the strategic plan role for ITAC, and we asked you all to take it away and have a look at it. We wanted to take a second to revisit the topic. To recap what John Simon talked about when he was here about a month ago, ITAC will be of service as a strategic planning group. School plans will all come before ITAC, and we will have the opportunity to identify if issues have been missed or need to be developed further. There are working groups with broader themes, about 19 of them, covering topics like the undergraduate experience, instructional technology, and arts. ITAC can have one of two potential roles: either they can use the same system of checks and balances, or for those that are particularly relevant in terms of technology they can ask for direct ITAC interaction.
VI. Security of Website - Chris Cramer, Rob Adams
Chris Cramer says a few months back some bad things happened, and that led to a discussion of how to handle such things. There are all sorts of websites around Duke, and they have all kinds of information on them. John Burness pulled a bunch of us into a room and asked us how to prevent this from happening again. We're interested in getting ITAC's feedback on what you think should be in some kind of Web security policy, such as what we cover and what requirements we make. We're interested particularly in sites with personal and sensitive information. We're also interested in keeping an eye on sites that may have a significant Duke branding issue, like the OIT website. If the OIT website got compromised, it would not look good. Given that, what can we do to make sure sites with sensitive information or lots of branding aren't compromised? We have the start of a list, but we wanted feedback early rather than later.
Molly Tamarkin says at the Nicholas School web work is often contracted without contact with our IT organization.
Chris Cramer says we did have this discussion. If we talked to everyone at Duke there would still be sites we don't know about. Part of this process is so that when we locate something, we can say, “Here is the policy, this is what we need to be doing.” This document says the scope is everything, including vended sites.
Molly Tamarkin says the point is that if we want to secure websites, it must be shared with people in a way that they can understand that what they're contracting for may apply.
George Oberlander says we also have an incident response policy. People need to know they have to do something.
Chris Cramer says we're not currently tasked with the incident response, but others are putting that together. One thing we're hoping to hear from faculty members is: does this make sense, is this a reasonable thing to be doing?
Ed Gomes says what if someone links from a Duke site to a Roadrunner site that contains sensitive Duke information?
Chris says if it's Duke information, then we care. Ultimately, we are responsible for it.
Ed says so, if it is linked from a Duke origin, we have a right to look at it.
George Oberlander asks suppose it is not linked?
Chris says that's an interesting question. I don't know how we'd stop you, but we'd stop you somehow. We do have to work on the educational component, but we need to have something out there to show people, “this is why your website can't be up.”
John Board asks how do you envision this working in practice?
Chris says I don't think we've gotten that far, but I don't want to say we're just going to sit passively around and not do anything proactive, but how that will work out is not clear yet.
Rafael says a lot of this stuff obviously is on the technology side, but others are part of something that could just as easily happen on paper. In the Health System, particularly the School of Medicine, we are being very careful. We recently redid the school's site, and there was more to this than people think about. Prior to this, when you did a search for the Medical School on Google, it didn't even turn up on the first ten pages. Now we turn up at number ten, so there is an art to this; we need to determine what we are trying to do. There is a difference between a personal webpage and what a school or department wants to do.
Tracy Futhey says Chris and Rob are on a fairly short timeline. If we see nothing here to suggest they are moving in the wrong direction, they should move ahead and put words to this. If anyone wants to be more intimately engaged in this, contact Chris.