美文网首页
Data Flow Transformations(2):sor

Data Flow Transformations(2):sor

作者: 鲸鱼酱375 | 来源:发表于2019-10-02 09:56 被阅读0次

Sort

  • Sorts input data in ascending or descending order.

  • Multiple sorts can be applied; each sort is identified by a numeral that determines the sort order.

  • Positive number denotes ascending order, and a negative number denotes descending order.

  • Column with the lowest number is sorted first, second lowest number is sorted next, and so on.

  • It can also remove duplicate rows as part of its sort.

  • It has one input and one output. It does not support error outputs.

  • full blocking


    image.png

how to sort the output
change the source output, false to true


image.png
image.png

Aggregate

  • Used to perform following operations on input columns:
    ▶Sum, Average, Count, Count Distinct, Minimum, Maximum
    ▶Group By
    ▶The comparison options of the aggregation

  • Handles null values in the same way as the SSMS does.

  • IsBig property can be set for handling big or high-precision numbers.

  • Performance Can be Improved by:
    ▶Set Keys or KeysScale properties When performing a Group by operation
    ▶Set CountDistinctKeys or CountDistinctScale properties while performing Distinct count operation.
    ▶because the tranformation is able to allocate adequate memory for the data that the transformation caches.

  • The Aggregate transformation has one input and one or more outputs. It does not support an error output.

  • full blocking

count distinct ask more space from ram???
isbig

Union All

  • Combines multiple inputs into one output.
  • Inputs are added to the transformation output one after the other; no reordering of rows occurs.
  • At least one input must be mapped to each output column.
  • Metadata of the columns must match. dayatype must be matched
  • Columns that are not mapped, are set to null values in output.
  • This transformation has multiple inputs and one output. It does not support an error output.
  • semi-blocking

Merge

  • Combines two sorted datasets into a single dataset.
  • Merging is based on values in their key columns.
  • It can be used to perform:
    ▶Merge data from two data sources.
    ▶Create complex datasets by nesting Merge transformations.
    ▶Remerge rows after correcting errors in the data.
  • Inputs have matching metadata.
  • This transformation has two inputs and one output. It does not support an error output.
  • semi blocking

Merge Join

  • Combines two sorted datasets into a single dataset by using a FULL, LEFT, or INNER join.
  • It can configured as follows:
    ▶Specify the join: FULL, LEFT, or INNER join.
    ▶Specify the columns the join uses.
    ▶Specify whether the transformation handles null values as equal to other nulls.
  • Joining columns have matching metadata.
  • This transformation has two inputs and one output. It does not support an error output.
  • semi blocking
  • swap input

sorting in the property, it can improve performance(?)

Conditional Split

  • It can route data rows to different outputs depending on the content of the data.

  • It evaluates expressions, and based on the results, directs the data row to the specified output.

  • It can configured as follows:
    ▶Specify an expression that evaluates to a Boolean for each condition.
    ▶Specify the order in which the conditions are evaluated.
    ▶Specify the default output for the transformation.

  • This transformation has one input, one or more outputs and one error output.

  • like case statement in sql

  • non-blocking

  • 注意大小写

eg: [id]>1 && [id]<5

Multicast

  • Multicast transformation distributes its input to one or more outputs.
  • Multicast transformation directs every row to every output, unlike Conditional Split directs a row to a single output.
  • You configure the Multicast transformation by adding outputs.
  • This transformation has one input and multiple outputs. It does not support an error output.
  • non- blocking
  • like copy the dataset

区别: conditional的是符合条件的去,multicast是复制粘贴,不管条件

Row Count

  • Count the number of rows, as they pass through a data flow and stores the final count in a variable.
  • Variable used Must already exist.
  • Variable must be in the scope of Data Flow Task using Row Count.
  • Row count value is stored in the variable only after the last row has passed the transformation.
  • This transformation has one input and one output. It does not support an error output
  • non-blocking
  • https://www.c-sharpcorner.com/UploadFile/muralidharan.d/how-to-use-rowcount-in-ssis/

Audit

  • Audit transformation enables the data flow in a package to include data about the environment in which the package runs.
  • SSIS includes system variables that provide this information.
  • It can be configured as:
    ▶Provide the name of a output column.
    ▶Map the system variable to the output column.
  • Single system variable can be mapped to multiple columns.
  • This transformation has one input and one output. It does not support an error output.
  • non blocking

how to show the num

相关文章

网友评论

      本文标题:Data Flow Transformations(2):sor

      本文链接:https://www.haomeiwen.com/subject/bnvaectx.html