Saturday, January 19, 2019

Database Administrator Interview Question - Intermediate Level


Level 3: The Riddler
“Riddle me this, riddle me that.” The ability to ask the right question can unlock as much information as you can handle, and in some cases more than you ever want to know. However, just asking questions isn’t enough, since in order to have questions answered, there must be data present in the first place. Being able to generate data and then know that it is safe for future use is vital, since you never know when some scrap of information might be useful in your next caper project.

  • What does ‘INSERT’ do?
INSERT submits data into a database as a new row, usually through the use of a form. While forms can take many...forms..., the most common uses are through either a dedicated application or through the use of an HTML form. Clicking on the ‘submit’ button will trigger the built in form reaction to scan the form for particular fields, making sure the required ones are entered correctly, make sure the user isn’t being naughty in what they are trying to enter, then submit the data to the database.
  • What does ‘DROP’ do?
DROP removes a table from a database or a database from a server. A very dangerous command indeed, it is only to be used in situations that absolutely require it, as unless you have a backup of it handy, there is no coming back from this.
  • What is the difference between T-SQL and PL/SQL?
T-SQL or Transact-SQL is Microsoft’s version of SQL. The main additions Microsoft made to the main branch of SQL involve the addition of procedures or routines — scripts essentially — that can be run under certain criteria. PL/SQL, on the other hand, is Oracle’s version of SQL, and conceptually the two are very similar. However, because of the nature in how they were developed, trying to move data from one to the other involves quite a bit of work. The main differences deal with how they multi-task and how they lock elements when they are in use.
  • What does ‘UPDATE’ do?
UPDATE allows values to be modified where they meet specific criteria. For example, say that you were on Amazon and were about to move. As a result, you would want to adjust your mailing address so that you actually got your stuff. You would therefore go into your settings and it would show you your current address. Modifying this address and then submitting the form would update your address based on your particular user profile. If it updated anybody else’s address to match that would be a serious problem — at least for the person doing the paying.
  • Why do database servers benefit from a lot of memory, and why do 64-bit operating systems help in this regard?
Database servers like to cache as much data as possible when they are reading it a lot. Storing this information in active memory is a lot faster than trying to find it again from the hard disk or other media. Therefore more memory = faster response time = better performance. The problem is that for most operating systems the maximum amount of memory that can be used by a 32-bit OS is 4 gigabytes. While in years past this would have been an inconceivable number, today it is a drop in the bucket. 64-bit operating systems resolve this issue by being able to handle memory to 192 gigabytes currently for Windows, while Linux can theoretically go much higher at present, and these numbers will only climb higher and higher.
  • Why is it a bad idea to run a test on a live database?
On a test database, it’s relatively easy to keep the performance variables to a minimum. On a live database however, it needs to be functioning for all users all the time. Running untested code on a production database can not only reduce performance, but also create unforeseen instability in the server itself — potentially causing crashes and data corruption.
  • Why is it difficult to use standard file by file backup methods on an active database server?
This problem is twofold. First, many database servers place locks on database files that are currently in use. Most backup programs that try to do a file-by-file backup will therefore be unable to create a copy of this file, as they cannot get exclusive permissions to it. Second, while some database servers have only a single file to backup a database, others have multiple files that can be stored in different locations across possibly multiple physical hard disks. The problem can be resolved in one of two potential ways. First, using the backup method within the database server itself. Some programs such as Microsoft SQL Server allow you to create a scheduled backup directly within the server application to a location of your choosing. Others require you to use a scheduled task or another on-demand type of backup solution. The second would be to use a backup application that can talk directly to the database server, allowing the database to be backed up using a different technique.

  • When would you use an offline backup method versus an online backup?
If the above methods are unavailable when trying to create a backup solution, another potential method is temporarily taking down the database or database server in order to create a file-by-file backup. The problem with this method is that if the server goes down incorrectly, the backups could be flagged as bad and thus unusable. Periodically testing your backups to make sure they are working properly is strongly recommended, regardless of what method you use to create them.

  • What is Replication?
Database replication allows for real-time automated backups between multiple database servers. This allows for the creation of either a fall-over server, or warm backup for use in case the main server goes down.

  • Is data in databases encrypted by default?
While most database servers support some form of encryption out of the box, it is not enabled by default due to performance hits and security concerns.
Level 4: The Librarian Data Organization is critical to being able to navigate through large numbers of tables and much larger quantities of data. Being able to look ahead to potential applications and create and optimize tables in a way that allows them to grow without having to rebuild them entirely can help save many hours of hassle. Bananas are still optional and still tasty.
  • What is the difference between a ‘TINYINT’, an ‘INT’ and a ‘BIGINT’?
Contrary to popular belief, this is not the lifecycle of a talking tree. Rather, it is the creation of a column that allows for specific levels of integers (numeric whole numbers) up to a specified cap. There are many ways to limit the growth of fields, but in the case of Microsoft SQL Server, these each represent a value in bytes, which creates a maximum value that the field can hold.
Data type Range Storage
bigint -2^63(-9,223,372,036,854,775,808)to2^63-1 (9,223,372,036,854,775,807) 8 Bytes
int -2^31 (-2,147,483,648) to 2^31-1 (2,147,483,647) 4 Bytes
tinyint 0 to 255 1 Byte

  • How would you store files within a database?
Two common ways to store files for use by a database are either within the operating system’s file system, or within a field of the table itself. Uploading and storing the files outside of the database makes for faster creation of the application, and can be more efficient if the file sizes are larger, but can potentially cause security issues if the files are not secured correctly. On the other hand, the files can also be stored directly within the database using a BLOB-type field. A BLOB is a Binary Large Object, essentially an empty area where a file can be uploaded to but not exceed a specified limit. Like int in the example above, blob has a number of different potential sizes, depending on the type used. Bear in mind there are other methods for storing and accessing files in a database server, these two are merely the most common.
  • When would you use ‘char’ versus ‘varchar’?
This is a bit of a difficult question, mostly because it depends so much on what your application is. For example, if you have a form field that can be nearly any length and changes every single time, then varchar is a much more practical choice, since it gives you much more flexibility. If however you have a field where every value is going to be exactly the same length, then you can get more efficient performance out of a char. Again, it depends on exactly what your application is, and how you plan to cook it — seasoning as you see fit.
  • What is XML?
Extensible Markup Language (XML) is a fast way to display data that not only conforms to a structure that can be read by machines, but is also easily understandable by humans. Because they can be dynamically and manually generated in many different ways, they are easy to produce and map to; and because they retain the same structure despite the data being updated, they can be relied upon for automatic functions such as RSS aggregation.
  • What shows that a database server is running?
<Insert joke about needing to catch it here/> Database servers run as services or daemons, most times in the background without the necessity to see that they are running in order to interact with them. When things go sideways however, being able to verify that the service is in fact up and running can be an excellent place to start troubleshooting. Checking under the services area of your particular operating system, whether that be by GUI or by CLI, can show you that the service is started or not, thus allowing you either to start or restart it as need be.
  • What is WYSIWYG?
What You See Is What You Get. A mouthful of an acronym, it allows for the creation of an application that is consistent regardless of how it is viewed — whether on the design screen, being viewed in a browser or being printed.
Creating an interface to a database that is not only functional but also looks nice is a trick in itself, and can take a lot of work to get it just right.
  • Why is it frowned upon to use ‘SELECT * ..’ in a large database?
Picture it like a group of people in line for a bathroom, and every single person that was going in there was going to use the toilet, change their clothes, take a shower, iron their jacket, take another shower, etc. There is only so much area that can be used efficiently before you start to get a queue, slowing down the whole operation that can eventually cause the entire thing to collapse under lack of toilet paper. You can quickly get back more than you can use or understand, so optimization is key when creating queries and asking only what you need to get the question answered.
  • How would you get the quantity of results from a query?
COUNT() is the main supported way to be able to get the number of returned results from a query. While there are many other options such as mysql_num_rows, these are considered obsolete and are being removed.
  • What is a Database Schema?
If you’ve ever seen one of those Visio diagrams with 40 different tables with lines connecting particular columns on one with those on another, that’s a database schema. Essentially a two-dimensional representation of how each table talks to other ones, it is the way to view the design of a database as a single entity and not as a jumble of different tables.
  • What are Nested Queries?
A query within a query, this particular method can be tremendously difficult to troubleshoot and even harder to manage without a lot of overhead. In most cases, a nested query can be replaced with a JOIN, allowing for much more efficient use of resources.

No comments:

Post a Comment