Speaking on ETL Security

I will be giving a presentation on ETL (Extract, Transform, Load) security at two user groups in the coming weeks.

Securing the ETL Pipeline

We’re going to look at typical ETL (Extract, Transform, Load) pipelines and consider the weak points an attacker might go after. Our goal in this isn’t to cause FUD (Fear, Uncertainty, and Doubt), but to discuss risks at each point, options for protecting the vulnerability, and what we’ve seen typically done (if anything). While this talk primarily focuses on Microsoft SQL Server, especially the database engine and SSIS, many of the points covered will be applicable to any solution set.

If you’re in or near the Midlands or Upstate of SC, I’d love for you to come out so we can meet and discuss this topic, professional development, and SQL Server in general:

August 5, 2014 – SQL Server Innovators Guild – Greenville, SC

August 14, 2014 – Midlands PASS Chapter – Columbia, SC

Midlands PASS July Meeting – July 10

The Midlands PASS Chapter will hold its next meeting on July 10. We meet at MicroStaff IT in Cayce, SC. Here is the main presentation:

Statistics, Indexes, and their Impact

Speaker: Brian Kelley, SQL Server MVP

Statistics. Indexes. Clustered Indexes. Non-Clustered Indexes. Covering Indexes. Bookmark Lookups. Perhaps you’ve heard these terms. They determine how well or poorly your queries run. In this session, we’ll look at what these things are, how they impact your queries, what to do to maximize their use, and when you should consider making changes.

This is a 100-200 level presentation.

You can RSVP for the meeting (it helps us plan for food) at the EventBrite Event Page for this meeting.

 

PASS Summit Session Selection

Let’s make it democratic. Let’s ensure we get solid sessions from key people. And let’s save a ton of work in the process.

Spotlight Sessions:

There are certain folks that are extremely knowledgeable in their areas of expertise. They also happen to be excellent presenters. Have the spotlight sessions and invite them to present a talk. Limit the number of these, obviously. However, this ensures top speakers are presenting.

Let the Community vote:

Since folks had to update their community profiles in order to participate in the voting, let’s go down that road, except for session selection. It doesn’t matter if it’s a pre-con, a regular session, or a lightning talk. Put the abstracts up, complete with who is giving them, and give the community a chance to vote on a particular number. Perhaps for each track you get to vote for your top 10.

Does this skew things in favor of those who are more popularly known? Yes. But it also means the community is seeing who they want. So what about those who don’t have as solid a reputation? Let them build a reputation via the following:

  • local user groups
  • SQL Saturdays
  • virtual chapters

That’s what’s effectively being done by having a speaker rating score, anyway.

Set the deadline, tabulate the votes, and then take the top presentations per track and schedule them.

What if there’s a tie? Use a random number generator to make the selection.

But what if there are too many tracks?

Limit the number of submissions. Perhaps:

  • 1 pre-con
  • 2 regular session talks
  • 2 lightning session talks

Speakers try to game the system today because they don’t know what the selection committee for the Summit or for a particular SQL Saturday will want. So they submit more sessions than they actually want to give. Limit the # of submissions. What about panels? If the panel discussion is that important to you, then it takes one of your slots. No apologies, because you know that ahead of time. This causes a speaker to focus on what topics he or she really want to speak on and think will go over with the audience.

Will there be issues?

Of course there will. But this is more transparent than having selection committees behind the scenes. PASS, after all, is a community organization. It also eliminates any board influence (and there has been board influence in the past). So let’s keep this simple.

But that’s not how XYZ Conference does it!

No, it’s not. But XYZ Conference is probably not run by a community organization. If it is, perhaps they should follow the same model.

What is an “operational” DBA?

On Facebook last night, I posted the following:

An operational DBA isn’t just a manager of a traditional RDBMS, transactional system. An operational DBA manages the data platform, whatever it is, when it hits production. Their goals are not traditionally the same as someone focused on development. They are looking to keep the system secure, performing well, and ensure it is recoverable in case of failure/disaster. This could be over a traditional system, a no-SQL solution, a warehousing solution built around star/snowflake schemas, or anything that has to do with data storage, management, and retrieval. If you think an operational DBA is a manager of just a transactional system, please update your definition. Operations has changed greatly in the last decade to keep pace with development and business.

The reason I posted this is because I saw some indications that there is a misunderstanding of what an operational DBA is nowadays. Not surprisingly, this misunderstanding came from folks who aren’t in an operational role. Here’s the gist of what an operational DBA is typically concerned with:

  • Production security
  • Production performance / reliability
  • Production recovery

I’m not ranking those in order of importance. The order of those three items depends on the environment. I’ve been in environments where security was more important performance / reliability or recovery (military). I’ve also been in environments where performance /reliability was king.

Note that I didn’t include a particular type of platform. That’s on purpose. I also didn’t indicate whether or not production was on premise or hosted. That’s also on purpose. If you’re still thinking that an operational DBA is only concerned with the care and feeding of an on-premise transactional SQL Server, Oracle, DB2, MySQL/MariaDB, etc. platform, it’s time to update your thinking. An operational DBA takes over when a system transfers from Dev/QA/UAT to production, when it becomes operational (hence the name). Platform, type, where it’s hosted, and those types of details are irrelevant if the system is about data management and it just rolled to production. While there are some operational DBAs who solely focus on transactional systems, that’s not true of all operational DBAs. And that’s not what we operational types mean when we say, “operations.”

“A good DBA is a lazy DBA”

I’m borrowing from Andy Leonard (blog | twitter) who says all the time, “Good engineers are lazy.”

If you’re thinking, “Why would I want (to be) a lazy DBA?” let me explain. There’s a lot to be said for hard work. However, have you ever seen someone who is always busy but seems to get very little done? Hard work, in and of itself, isn’t the goal. The goal is to get things done. This is where laziness comes in.

If I have to repeat a task, I should look at automating it. I don’t want to have to repeat those steps each time. I want to be lazy. For instance, as an operational person, there are a lot of things I need to review within my production environment on a periodic basis to ensure my systems are running as they should. Case in point: ensuring the SQL Server drives don’t run out of free space. This is something that I should monitor regularly (daily).  There are different ways I could handle this:

  • I could log on to each server and check each drive.
  • I could use a tool like TreeSize to hit the drives one by one.
  • I could automate the checks which results in a report in my inbox.

I prefer the last option. If I’m smart, I’ll even schedule it to run so it’s ready for me when I get in each morning. But why stop there? I could not only automate gathering the data, but also automate some of the analysis. Let’s say I set thresholds and the automation bubbles up anything crossing a threshold to the top of my report, meaning the most important data gets seen first. I don’t want to just be lazy, I want to be very lazy. By being this lazy and automating, I free up the one resource I can never get more of: time.

What can you automate? Anything and everything you can successfully automate frees up time for you to spend tackling other things. The more your organization sees and understands that you do, the more valuable you are. If you are in IT but don’t happen to be a DBA, this is still a solid approach. Let me generalize and say being a lazy IT pro is being a good IT pro.

 

Service Packs coming for SQL Server 2008/2008 R2

I’m not prophetic, I promise. However, some good news on the service pack front with regards to SQL Server 2008 and 2008 R2. There have been rumblings about a last service pack for these versions of SQL Server for a while, but nothing official had been said. However, an official announcement was made yesterday afternoon:

“We are planning to ship one last Service Pack for both SQL Server 2008 and SQL Server 2008 R2. Because of the maturity of SQL Server 2008 and 2008 R2, these Service Pack(s) will be an exception in terms of timing and will ship after mainstream support of these releases ends on July 8th 2014.”

The good news is that we will be seeing one final service pack for both SQL Server 2008 and SQL Server 2008 R2. If you’re not familiar with the support dates, you can see them at the Microsoft Product Lifecycle site which includes a search by product. For instance, you can see the support dates for all the versions of SQL Server from SQL Server 2000 on.

 

Do you apply SQL Server Cumulative Updates?

I think Steve Jones makes a great point here with respect to cumulative updates:

“This is one reason I’ve been hesitant to remain current with Cumulative Updates (CUs). Microsoft doesn’t stand behind them, with the text on each CU page that users should only apply the patch if they are experiencing specific problems. Otherwise users are told to wait for the next Service Pack, which seem to be coming less and less often.”

When you look at the fact that service packs for SQL Server (and most Microsoft products) have been few and far between, this presents a problem. There aren’t a lot of bug fixes for SQL Server specifically, but there are important ones likes ones to fix data corruption, inaccurate result sets, and an infinite loop condition against certain dynamic management views. However, if you consider applying a cumulative update, here’s the text Steve was referring to:

“A supported cumulative update package is now available from Microsoft. However, it is intended to correct only the problems that are described in this article. Apply it only to systems that are experiencing these specific problems. This cumulative update package may receive additional testing. Therefore, if you are not severely affected by any of these problems, we recommend that you wait for the next SQL Server 2008 R2 service pack that contains the hotfixes in this cumulative update package. “

When was the last time a service pack released for SQL Server 2008 R2 (just taking one supported version)? It was July 26, 2012. In other words, we’re approaching the two year point. Therefore, is it wise to wait for that service pack? According to Microsoft, you should unless you are “severely affected.” However, what is meant by “severely?” If I don’t get accurate result sets back because I implement a FULL JOIN with CROSS APPLY, that’s a problem. If I have data corruption because LOB data, that’s a problem. If I try to query what’s executing and lock up my SQL Server, that’s a problem. In my view, all three of those qualify as severely. Great, but will I get the kind of support I should? If I take that text at face value, it basically says, “Installer beware.” That’s a terrible position to be in as a customer.

Which leads me to the conclusion that either (a) Microsoft should step up support on the cumulative updates and reflect this in their language or (b) Microsoft should release service packs more regularly. I don’t foresee either happening in the near future, but as a customer, I believe it’s a reasonable request.

Previous Older Entries

Follow

Get every new post delivered to your Inbox.

Join 3,836 other followers