The focus of this lesson is logical query processing—the conceptual interpretation of the query that defines the correct result.

SQL is a declarative language: declarative means that you define what you want, as opposed to imperative languages, in which you also define how to achieve what you want. For this reason, it is important not to draw any performance-related conclusions from what you learn about logical query processing.

When addressing performance aspects of the query, you need to understand how optimization works. Consider the following request in T-SQL, and try to think of the order in which the request needs to be logically interpreted. For example, how would you define the instructions to a robot instead of a human? Therefore, contrary to the keyed-in order of the query, logical query processing has to start with the FROM clause. Of course, things can get more complex. If you understand the concept of logical query processing well, you will be able to explain many things about the way the language behaves—things that are very hard to explain otherwise.
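As a minimal sketch, assuming a Sales.Orders table with custid and orderid columns and an arbitrary filter, the difference between the keyed-in order and the logical processing order looks like this:

SELECT custid, orderid   -- 3. processed last
FROM Sales.Orders        -- 1. processed first
WHERE custid = 71;       -- 2. processed second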

Logical Query Processing Phases

This section covers logical query processing and the phases involved. Subsequent chapters in this Training Kit provide more detail, and after you go over those, this topic should make more sense. To make sure you really understand these concepts, make a first pass over the topic now and then revisit it later after going over Chapters 2 through 5.

Here is the logical query processing order of the six main query clauses:

1. FROM
2. WHERE
3. GROUP BY
4. HAVING
5. SELECT
6. ORDER BY

Logical query processing starts with the FROM clause. The output table of one phase is considered the input to the next phase. This is in accord with operations on relations that yield a relation.

The lesson's sample query (sketched after this paragraph) behaves as follows: it filters only employees that were hired in or after a certain year; it groups the remaining employees by country and the hire year; it keeps only groups with more than one employee; and for each qualifying group, it returns the hire year and count of employees, sorted by country and hire year, in descending order.
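A sketch of a query matching that description, assuming the HR.Employees table from the book's sample database, with country and hiredate columns (the cutoff year is illustrative):

SELECT country, YEAR(hiredate) AS yearhired, COUNT(*) AS numemployees
FROM HR.Employees
WHERE hiredate >= '20030101'
GROUP BY country, YEAR(hiredate)
HAVING COUNT(*) > 1
ORDER BY country, yearhired DESC;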

The following sections provide a brief description of what happens in each phase according to logical query processing. The first phase processes the FROM clause. If you need to query just one table, you indicate the table name as the input table in this clause.

The output of this phase is a table result with all rows from the input table. In the sample query, the input is the Employees table (nine rows), and the output is a table result with all nine rows (only a subset of the attributes is shown).

The next phase filters rows based on the predicate in the WHERE clause. Only rows for which the predicate evaluates to true are returned. In the sample query, six rows are returned from this phase and are provided as input to the next one. Note that an attempt to refer in the WHERE clause to the yearhired alias, which is assigned only later in the SELECT phase, fails with the following error.

Msg , Level 16, State 1, Line 3
Invalid column name 'yearhired'.
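A sketch of the kind of query that produces this error (assuming the same HR.Employees table):

SELECT country, YEAR(hiredate) AS yearhired
FROM HR.Employees
WHERE yearhired >= 2003;   -- fails: the alias is assigned in a later phase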

The next phase groups rows based on the GROUP BY clause. It associates each input row with its respective group. Within the six rows in the input table, this step identifies four groups. Here are the groups and the detail rows that are associated with them (redundant information removed for purposes of illustration). The final result of this query has one row representing each group (unless the group is filtered out).

Therefore, expressions in all phases that take place after the current grouping phase are somewhat limited. All expressions processed in subsequent phases must guarantee a single value per group. If you refer to an element from the GROUP BY list (for example, country), you already have such a guarantee, so such a reference is allowed.

Filter Rows Based on the HAVING Clause

This phase is also responsible for filtering data based on a predicate, but it is evaluated after the data has been grouped; hence, it is evaluated per group and filters groups as a whole. As is usual in T-SQL, the filtering predicate can evaluate to true, false, or unknown. Only groups for which the predicate evaluates to true are returned from this phase. If you look at the number of rows that were associated with each group in the previous step, you will notice that only two of the groups (both for the UK) qualify.

Hence, the result of this phase has the following remaining groups, shown here with their associated detail rows.

The SELECT phase includes two main steps. The first step evaluates the expressions in the SELECT list; this includes assigning names to attributes that are derived from expressions. Remember that if a query is a grouped query, each group is represented by a single row in the result; in the sample query, this step therefore generates two rows. Also, all expressions in the SELECT list are conceptually evaluated at the same point in time, so one expression cannot refer to an alias assigned to another expression in the same list. A query that attempts this, like the following sketch, generates the error shown after it.
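The sketch assumes the HR.Employees table; the prevyear expression is illustrative:

SELECT country, YEAR(hiredate) AS yearhired, yearhired - 1 AS prevyear
FROM HR.Employees;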

Msg , Level 16, State 1, Line 1
Invalid column name 'yearhired'.

Note the use of the word conceptually. This behavior is different from that of many other programming languages, where expressions usually get evaluated in left-to-right order, making a result produced by one expression visible to the one that appears to its right. T-SQL is different, because all expressions that appear in the same logical query processing phase are evaluated conceptually at the same point in time.

This phase is responsible for returning the result in a specific presentation order according to the expressions that appear in the ORDER BY list. The sample query indicates that the result rows should be ordered first by country (in ascending order, by default), and then by yearhired, descending, yielding the following output. The result of this phase is what standard SQL calls a cursor. Note that the use of the term cursor here is conceptual. T-SQL also supports an object called a cursor that is defined based on a result of a query, and that allows fetching rows one at a time in a specified order.

You might care about returning the result of a query in a specific order for presentation purposes, or if the caller needs to consume the result in that manner through some cursor mechanism that fetches the rows one at a time. If you need to process the query result in a relational manner—for example, define a table expression like a view based on the query (details later in Chapter 4)—the result will need to be relational.

Also, sorting data can add cost to the query processing. A query can use the TOP or OFFSET-FETCH option; if it does, the same ORDER BY clause that is normally used to define presentation ordering also defines which rows to filter for these options.

In the practice exercises, you are provided with instructions on how to fix a query. Try to figure out why the query failed and what needs to be revised so that it would return the desired result.

There are multiple possible orderid values per customer. To fix the query, you need to apply an aggregate function to the orderid attribute. The task is to return the maximum orderid value per customer; therefore, the aggregate function should be MAX. Your query should look like the sketch that follows.
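A sketch of the corrected query, assuming the Sales.Orders table with custid and orderid columns:

SELECT custid, MAX(orderid) AS maxorderid
FROM Sales.Orders
GROUP BY custid;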

As in the first exercise, you are provided with instructions on how to fix the query. Clear the query window, type the following query, and execute it. Try to identify the problem in the query.

The query filters individual orders with a freight value greater than 20,000, and there are none. To correct the query, you need to apply the filter per each shipper group—not per each order. You need to filter the total of all freight values per shipper.

You try to fix the problem by using the following query. Try to identify why it fails and what needs to be revised to achieve the desired result.

Which of the following correctly represents the logical query processing order of the various query clauses?

Which of the following is invalid? It is relational as long as other relational requirements are met.

It cannot have duplicates. The order of the rows in the output is guaranteed to be the same as the insertion order. The order of the rows in the output is guaranteed to be the same as that of the clustered index.

Case Scenario 1: Importance of Theory

You and a colleague on your team get into a discussion about the importance of understanding the theoretical foundations of T-SQL.

Answer the following questions posed to you by your colleague:

1. Can you give an example of an element from set theory that can improve your understanding of T-SQL?
2. Can you explain why understanding the relational model is important for people who write T-SQL code?

Case Scenario 2: Interviewing for a Code Reviewer Position

You are interviewed for a position as a code reviewer to help improve code quality.

The queries have numerous problems, including logical bugs. Your interviewer poses a number of questions and asks for a concise answer of a few sentences to each question. Answer the following questions addressed to you by your interviewer:

1. Is it important to use standard code when possible, and why?
2. Is that a bad practice, and if so, why?

Review code samples in the T-SQL threads.

Try to identify cases where nonrelational elements are used; if you find such cases, identify what needs to be revised to make them relational. Provide a brief paragraph summarizing what happens in each step.

Lesson 1

1. Correct Answers: B and D
   A. Incorrect: It is important to use standard code.
   B. Correct: Use of standard code makes it easier to port code between platforms because fewer revisions are required.
   D. Correct: When using standard code, you can adapt to a new environment more easily because standard code elements look similar in the different platforms.

Correct Answer: D
A. Incorrect: A relation has a header with a set of attributes, and tuples of the relation have the same heading.

A set has no order, so ordinal positions do not have meaning and constitute a violation of the relational model. You should refer to attributes by their name.
Incorrect: A query is supposed to return a relation. A relation has a body with a set of tuples. A set has no duplicates. Returning duplicate rows is a violation of the relational model.
Correct: Because attributes are supposed to be identified by name, ensuring that all attributes have names is relational, and hence not a violation of the relational model.

Correct Answer: B

Correct Answers: C and D
A. Incorrect: T-SQL allows grouping by an expression.

Correct Answer: A

Case Scenario 1
Even though T-SQL is based on the relational model, it deviates from it in a number of ways.

But it gives you enough tools that if you understand the relational model, you can write in a relational way. Following the relational model helps you write code more correctly.

Case Scenario 2
1. It is important to use standard SQL code. From a relational perspective, you are supposed to refer to attributes by name, and not by ordinal position. The order should be considered arbitrary.

You also notice that the interviewer used the incorrect term record instead of row. You might want to mention something about this, because the interviewer may have done so on purpose to test you.

From a pure relational perspective, this actually could be valid, and perhaps even recommended. But from a practical perspective, there is the chance that SQL Server will try to remove duplicates even when there are none, and this will incur extra cost.

It is hard to imagine searching for something on the web without modern search engines like Bing or Google. However, most contemporary applications still limit users to exact searches only. In addition, many documents are stored in modern databases; end users would probably like to have powerful search capabilities inside document contents as well.

Microsoft SQL Server substantially enhances the full-text search support that was available in previous versions. This chapter explains how to use full-text search and even semantic search inside a SQL Server database.

Before you start using full-text predicates and functions, you must create full-text indexes inside full-text catalogs.

Estimated lesson time: 60 minutes

Full-Text Search Components

In order to start using full-text search, you have to understand full-text components. For a start, you can check whether Full-Text Search is installed with a simple query (shown in the sketch after the next paragraph). Besides using full-text indexes on SQL Server character data, you can store whole documents in binary or XML columns, and use full-text queries on those documents.

You need appropriate filters for documents. Filters, called iFilters in full-text terminology, extract the textual information and remove formatting from the documents. You can check which filters are installed in your instance by querying a catalog view; the sketch below shows both this check and the installation check mentioned earlier.
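Both checks are simple; this sketch uses the standard SERVERPROPERTY function and the sys.fulltext_document_types catalog view:

-- Is the Full-Text Search component installed on this instance? (1 = yes)
SELECT SERVERPROPERTY('IsFullTextInstalled') AS fulltext_installed;

-- Which document filters (iFilters) are registered?
SELECT document_type, path
FROM sys.fulltext_document_types
ORDER BY document_type;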

Therefore, only the first and third queries are completely equivalent. You should prefer an explicit namespace definition to using the default element namespace. The queries used a relative path to find the Customer element.

Before looking at all the different ways of navigation in XQuery, you should first read through the most important XQuery data types and functions, described in the following two sections. You already know about SQL Server types. This section lists only the most important ones, without going into details about them. XQuery data types are divided into node types and atomic types.

The node types include attribute, comment, element, namespace, text, processing-instruction, and document-node. The most important atomic types you might use in queries are xs:boolean, xs:string, xs:QName, xs:date, xs:time, xs:dateTime, xs:float, xs:double, xs:decimal, and xs:integer. You should just do a quick review of this much-shortened list. The important thing to understand is that XQuery has its own type system, that it has all of the commonly used types you would expect, and that you can use specific functions on specific types only.

Therefore, it is time to introduce a couple of important XQuery functions. They are organized into multiple categories. The data function, used earlier in the chapter, is a data accessor function. The following query uses the aggregate functions count and max to retrieve information about orders for each customer in an XML document. For now, treat this query as an example of how you can use aggregate functions in XQuery.
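A sketch of such an aggregate query, using a small illustrative CustomersOrders document (the element and attribute names follow the chapter's examples, but the data is made up):

DECLARE @x AS XML = N'
<CustomersOrders>
  <Customer custid="1">
    <Order orderid="10692" />
    <Order orderid="10702" />
  </Customer>
  <Customer custid="2">
    <Order orderid="10308" />
  </Customer>
</CustomersOrders>';

SELECT @x.query('
  for $c in /CustomersOrders/Customer
  return
    <OrderSummary custid="{data($c/@custid)}"
                  ordercount="{count($c/Order)}"
                  maxorderid="{max($c/Order/@orderid)}" />
') AS ordersummaries;

-- Expected output (one summary element per customer):
-- <OrderSummary custid="1" ordercount="2" maxorderid="10702" />
-- <OrderSummary custid="2" ordercount="1" maxorderid="10308" />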

The result of this query is as follows. Actually, there is not enough space in this book to fully describe all possibilities of XQuery navigation; you have to realize this is far from a complete treatment of the topic. The basic approach is to use XPath expressions. With XQuery, you can specify a path absolutely or relatively from the current node.

XQuery takes care of the current position in the document; this means that you can refer to a path relatively, starting from the current node, to which you navigated through a previous path expression.

Every path consists of a sequence of steps, listed from left to right. A complete path might take the following form. In the second step, you can see in detail from which parts a step can be constructed. In the example, the axis is child::, which specifies child nodes of the node from the previous step. In the example, element-name is the node test; it selects only nodes named element-name.
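A sketch of one such path, wrapped in the query method, with the second step written out in full (the CustomersOrders names follow the chapter's examples):

DECLARE @x AS XML = N'
<CustomersOrders>
  <Customer custid="1"><companyname>Customer NRZBB</companyname></Customer>
  <Customer custid="2"><companyname>Customer MLTDN</companyname></Customer>
</CustomersOrders>';

-- axis::node-test[predicate]
-- child::                  the axis: child nodes of the node from the previous step
-- Customer                 the node test: only nodes named Customer
-- [attribute::custid=2]    the predicate: only Customer nodes whose custid attribute is 2
SELECT @x.query('/CustomersOrders/child::Customer[attribute::custid=2]') AS second_customer;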

Note that in the predicate example, there is a reference to the attribute:: axis; the at sign (@) is an abbreviation for the attribute:: axis. This might look a bit confusing; it might help if you think of navigation in an XML document in four directions: up (in the hierarchy), down (in the hierarchy), here (the current node), and right (in the current context level, to find attributes).

The following list describes the axes supported in SQL Server:

child:: is the default axis; you can omit it. Direction is down.
self::; direction is here.
descendant-or-self::; direction is here and then down.
attribute::; direction is right.
parent:: retrieves the parent of the context node; direction is up.

A node test can be as simple as a name test. Specifying a name means that you want nodes with that name.

You can also use wildcards. A principal node is the default node kind for an axis. The principal node is an attribute if the axis is attribute::, and it is an element for all other axes. You can also narrow down wildcard searches.

You can also perform node kind tests, which help you query nodes that are not principal nodes. Numeric predicates simply select nodes by position. You include them in brackets. You can also use parentheses to apply a numeric predicate to the entire result of a path. Boolean predicates select all nodes for which the predicate evaluates to true. XQuery supports the logical and and or operators.

However, you might be surprised by how comparison operators work. They work on both atomic values and sequences. For sequences, if one atomic value in a sequence leads to a true exit of the expression, the whole expression is evaluated to true. Look at the following example. The second evaluates to false because none of the atomic values from the first sequence is less than any of the values from the second sequence.

The third expression is true because there is an atomic value in the sequence on the left that is equal to the atomic value on the right. The fourth expression is true because there is an atomic value in the sequence on the left that is not equal to the atomic value on the right. Interesting result, right? If this confuses you, use the value comparison operators.

The familiar symbolic operators in the preceding example are called general comparison operators in XQuery. Value comparison operators do not work on sequences; they work on singletons only. The following example shows usage of value comparison operators.
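A sketch showing a value comparison, with general comparisons included for contrast, hosted by the query method of an empty XML variable (the literal sequences are illustrative):

DECLARE @x AS XML = N'';

SELECT
  @x.query('(1, 2, 3) = (2, 4, 6)')  AS general_eq,  -- true: the value 2 appears in both sequences
  @x.query('(1, 2, 3) != (2, 4, 6)') AS general_ne,  -- true: at least one pair of values differs
  @x.query('(5) eq 5')               AS value_eq;    -- eq compares singletons only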

The general comparison operators and their value comparison counterparts are = (eq), != (ne), < (lt), <= (le), > (gt), and >= (ge).

XQuery also supports a conditional if..then..else expression. It is more like a function that evaluates a logical expression parameter and returns one expression or another depending on the value of the logical expression. FLWOR is the acronym for for, let, where, order by, and return. You can use it to iterate through a sequence returned by an XPath expression. Although you typically iterate through a sequence of nodes, you can use FLWOR expressions to iterate through any sequence.

You can limit the nodes to be processed with a predicate, sort the nodes, and format the returned XML. Input sequences are either sequences of nodes or sequences of atomic values. You create atomic value sequences by using literals or functions. The expression used for an assignment can return a sequence of nodes or a sequence of atomic values.

You control the order based on atomic values. With this clause, you format the resulting XML. The where clause limits the Order nodes processed to those with an orderid attribute smaller than a certain value. The expression passed to the order by clause must return values of a type compatible with the gt XQuery operator.

The query orders the XML returned by the orderdate element. Although there is a single orderdate element per order, XQuery does not know this, and it considers orderdate to be a sequence, not an atomic value. The numeric predicate specifies the first orderdate element of an order as the value to order by.

Without this numeric predicate, you would get an error. The return clause converts the orderid attribute to an element by creating the element manually and extracting only the value of the attribute with the data function. It returns the orderdate element as well, and wraps both in the Order-orderid-element element. Note the braces around the expressions that extract the value of the orderid element and the orderdate element.

XQuery evaluates expressions in braces; without braces, everything would be treated as a string literal and returned as such. This expression repeats twice in the query, in the order by and the return clauses. XQuery inserts the expression every time the new variable is referenced. Here is the result of the query.

Quick Check
1. What do you do in the return clause of a FLWOR expression?
2. What would be the result of the expression (12, 4, 7) ...?

Quick Check Answers
1. In the return clause, you format the resulting XML of a query.
2. The result would be true.
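To see how the clauses fit together, here is a minimal FLWOR sketch; the document, the orderid cutoff, and the element names are illustrative rather than the chapter's actual sample:

DECLARE @x AS XML = N'
<CustomersOrders>
  <Customer custid="1">
    <Order orderid="10702"><orderdate>2007-10-13</orderdate></Order>
    <Order orderid="10952"><orderdate>2008-03-16</orderdate></Order>
  </Customer>
</CustomersOrders>';

SELECT @x.query('
  for $o in /CustomersOrders/Customer/Order
  let $oid := $o/@orderid
  where $o/@orderid < 10900
  order by ($o/orderdate)[1]
  return
    <Order-orderid-element>
      <orderid>{data($oid)}</orderid>
      {$o/orderdate}
    </Order-orderid-element>
') AS flwor_result;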

In this practice, you start with simple path expressions, and then use more complex path expressions with predicates.

1. Connect to your TSQL database.
2. Use the following XML instance for testing the navigation.
3. Write a query that selects Customer nodes with child nodes.
4. Select principal nodes (elements, in this context) only. The result should be similar to the abbreviated result here.
5. Now return all nodes, not just the principal ones.

6. Return comment nodes only. The result should be similar to the result here.

1. Use the following XML instance (the same as in the previous exercise) for testing the navigation.
2. Return all orders for customer 2.
3. Return all orders with a given order number, no matter who the customer is.

Return the second customer who has at least one order.

Which node type test can be used to retrieve all nodes of an XML instance?

Which conditional expression is supported in XQuery? IIF B. CASE D.

XML is widely used, and almost all modern technologies support it. Databases simply have to deal with XML. Although XML could be stored as simple text, a plain-text representation means having no knowledge of the structure built into an XML document.

You could decompose the text, store it in multiple relational tables, and use relational technologies to manipulate the data. Relational structures are quite static and not so easy to change. Think of dynamic or volatile XML structures. Storing XML data in a native XML data type solves these problems, enabling functionality attached to the type that can accommodate support for a wide variety of XML technologies. Think about situations in which you have to support many different schemas for the same kind of event.

SQL Server has many such cases within it. Data definition language (DDL) triggers and extended events are good examples. There are dozens of different DDL events. Each event returns different event information; each event returns data with a different schema.

Event information in XML format is quite easy to manipulate. Another place to use XML is to represent data that is sparse. Your data is sparse and you have a lot of NULLs if some columns are not applicable to all rows. Standard solutions for such a problem introduce subtypes or implement an open schema model in a relational environment.

However, a solution based on XML could be the easiest to implement. A solution that introduces subtypes can lead to many new tables. SQL Server introduced sparse columns and filtered indexes. Sparse columns could be another solution for having attributes that are not applicable for all rows in a table.

Sparse columns have optimized storage for NULLs. If you have to index them, you can efficiently use filtered indexes to index known values only; this way, you optimize table and index storage. In addition, you can have access to all sparse columns at once through a column set. A column set is an XML representation of all the sparse columns, and it is even updateable.
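A sketch of the sparse column alternative, with a filtered index and a column set (the table and column names are illustrative, not from the book's sample database):

CREATE TABLE dbo.ProductsSparse
(
  productid            INT           NOT NULL PRIMARY KEY,
  productname          NVARCHAR(40)  NOT NULL,
  -- attributes that apply only to some product categories, stored sparsely
  volume               DECIMAL(10,2) SPARSE NULL,
  alcoholpct           DECIMAL(5,2)  SPARSE NULL,
  -- an updateable XML representation of all sparse columns
  additionalattributes XML COLUMN_SET FOR ALL_SPARSE_COLUMNS
);

-- Index only the rows that actually have a value
CREATE INDEX idx_ProductsSparse_volume
  ON dbo.ProductsSparse(volume)
  WHERE volume IS NOT NULL;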

However, with sparse columns and a column set, the schema is more complicated than a schema with an explicit XML column. You could have other reasons to use an XML model. XML inherently supports hierarchical and sorted data. If ordering is inherent in your data, you might decide to store it as XML.

You could receive XML documents from your business partner, and you might not need to shred the documents into tables. It might be more practical to just store the complete XML documents in your database, without shredding them. In the previous lesson of this chapter, you passed an XQuery expression as a parameter to the query method of the XML data type. The type's methods support querying (the query method), retrieving atomic values (the value method), checking existence (the exist method), modifying sections within the XML data (the modify method), as opposed to overwriting the whole thing, and shredding XML data into multiple rows in a result set (the nodes method).

You use the XML data type methods in the practice for this lesson. Note that the value method accepts an XQuery expression as the first input parameter. The second parameter is the SQL Server data type returned. The value method must return a scalar value; therefore, you have to specify the position of the element in the sequence you are browsing, even if you know that there is only one.

You can use the exist method to test whether a specific node exists in an XML instance. The exist method returns a bit, a flag that represents true or false; if it returns 1, the node searched for exists in the XML instance. The query method, as the name implies, is used to query XML data. You already know this method from the previous lesson of this chapter. It returns an instance of an untyped XML value. The XML data type is a large object type; the amount of data stored in a column of this type can be very large.
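A sketch that exercises the value, exist, query, and nodes methods against a small XML variable (the element and attribute names are illustrative):

DECLARE @x AS XML = N'
<CustomersOrders>
  <Customer custid="1">
    <Order orderid="10692" />
    <Order orderid="10702" />
  </Customer>
</CustomersOrders>';

-- value returns a scalar, so the position [1] is required
SELECT @x.value('(/CustomersOrders/Customer/@custid)[1]', 'INT') AS first_custid;

-- exist returns 1 when the node is found, 0 otherwise
SELECT @x.exist('/CustomersOrders/Customer[@custid = 1]') AS customer1_exists;

-- query returns an untyped XML fragment
SELECT @x.query('/CustomersOrders/Customer/Order') AS orders_fragment;

-- nodes shreds the Order nodes into one row per node
SELECT O.n.value('@orderid', 'INT') AS orderid
FROM @x.nodes('/CustomersOrders/Customer/Order') AS O(n);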

It would not be very practical to replace the complete value when all you need is just to change a small portion of it; for example, a scalar value of some subelement. The nodes method is useful when you want to shred an XML value into relational data. The result of the nodes method is a result set that contains logical copies of the original XML instances.

In those logical copies, the context node of every row instance is set to one of the nodes identified by the XQuery expression, meaning that you get a row for every single node from the starting point defined by the XQuery expression. The nodes method has to be invoked for every row in the table. This example shows how you can make a relational database schema dynamic. Suppose that you need to store some specific attributes only for beverages and other attributes only for condiments.

You could add an XML data type column to the Production.Products table of the TSQL database; for this example, call it additionalattributes. Because the other product categories have no additional attributes, this column has to be nullable. You can alter the Production.Products table to add this column, as in the sketch that follows this paragraph. With an XML schema, you constrain the possible nodes, the data type of those nodes, and more. This is exactly what you need for a dynamic schema; if you could validate XML data against a single schema only, you could not use an XML data type for a dynamic schema solution, because XML instances would be limited to a single schema.
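A minimal sketch of that change; the book's actual code may type the column against an XML schema collection, but here the column is added as plain, untyped XML:

ALTER TABLE Production.Products
  ADD additionalattributes XML NULL;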

Validation against a collection of schemas enables support of different schemas for beverages and condiments. If you wanted to validate XML values only against a single schema, you would define only a single schema in the collection. Creating the schema is a task that should not be taken lightly.
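As a sketch of how such a collection could be created and then bound to the column (the schema shown is a deliberately tiny illustrative fragment, not the book's full beverages and condiments schemas):

CREATE XML SCHEMA COLLECTION dbo.ProductsAdditionalAttributes AS N'
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            targetNamespace="urn:example:beverages"
            elementFormDefault="qualified">
  <xsd:element name="beverageattributes">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="percentvitaminsRDA" type="xsd:int" minOccurs="0" />
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>';

-- Re-type the XML column so that its contents are validated against the collection
ALTER TABLE Production.Products
  ALTER COLUMN additionalattributes XML(dbo.ProductsAdditionalAttributes);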
