New Immissions/Updates:
boundless - educate - edutalab - empatico - es-ebooks - es16 - fr16 - fsfiles - hesperian - solidaria - wikipediaforschools
- wikipediaforschoolses - wikipediaforschoolsfr - wikipediaforschoolspt - worldmap -

See also: Liber Liber - Libro Parlato - Liber Musica  - Manuzio -  Liber Liber ISO Files - Alphabetical Order - Multivolume ZIP Complete Archive - PDF Files - OGG Music Files -

PROJECT GUTENBERG HTML: Volume I - Volume II - Volume III - Volume IV - Volume V - Volume VI - Volume VII - Volume VIII - Volume IX

Ascolta ""Volevo solo fare un audiolibro"" su Spreaker.
CLASSICISTRANIERI HOME PAGE - YOUTUBE CHANNEL
Privacy Policy Cookie Policy Terms and Conditions
Vectorization - Wikipedia, the free encyclopedia

Vectorization

From Wikipedia, the free encyclopedia

For vectorization in mathematics, see Vectorization (mathematics). For the use of the term in computer graphics, see Raster to vector.

Vectorization, in computer science, is the process of converting a computer program from a scalar implementation, which does an operation on a pair of operands at a time, to a vectorized program where a single instruction can perform multiple operations on a pair of vector (series of adjacent values) operands. Vector processing is a major feature of both conventional and modern supercomputers.

One major research topic in computer science is the search for methods of automatic vectorization; seeking methods that would allow a compiler to convert scalar programs into vectorized programs without human assistance.

[edit] Automatic vectorization

Automatic vectorization, in the context of a computer program, refers to the automatic transformation of a series of operations performed sequentially, one operation at a time, to operations performed in parallel, several at once, in a manner suitable for processing by a vector processor.

An example would be a program to multiply two vectors of numeric data. A scalar approach would be something like:

 for (i = 0; i < 1024; i++)
 {
    C[i] = A[i]*B[i];
 }

This could be transformed to vectorized code something like:

 for (i = 0; i < 1024; i+=4)
 {
    C[i:i+3] = A[i:i+3]*B[i:i+3];
 }

Here, C[i:i+3] represents the four array elements from C[i] to C[i+3] and we assume that the vector processor can perform four operations for a single vector instruction. Since four operations are performed for an execution time roughly similar to time taken for one scalar instruction, the vector code can run up to four times faster than the original code.

There are two distinct compiler approaches: one based on the conventional vectorization technique and the other based on loop unrolling.

[edit] Loop-level automatic vectorization

This technique, used for conventional vector machines, tries to find and exploit SIMD parallelism from the loop level. It consists of two major steps as follows.

  1. Find an innermost loop that can be vectorized
  2. Transform the loop and generate vector codes

In the first step, the vectorizing compiler looks for obstacles that can prevent vectorization. A major obstacle for vectorization is true data dependence shorter than the vector length. Other obstacles include function calls and short iteration counts.

Once the loop is determined to be vectorizable, the loop is stripmined by the vector length and each scalar instructions within the loop body are replaced with the corresponding vector instructions. Below, the component transformations for this step are shown using the above example.

  • After stripmining
 for (i = 0; i < 1024; i+=4)
 {
   for (ii = 0; ii < 4; ii++)
   {
      C[i+ii] = A[i+ii]*B[i+ii];
   }
 }
  • After loop distribution using temporary arrays
 for (i = 0; i < 1024; i+=4)
 {
   for (ii = 0; ii < 4; ii++) tA[ii] = A[i+ii];
   for (ii = 0; ii < 4; ii++) tB[ii] = B[i+ii];
   for (ii = 0; ii < 4; ii++) tC[ii] = tA[ii]*tB[ii];
   for (ii = 0; ii < 4; ii++) C[i+ii] = tC[ii];
 }
  • After replacing with vector codes
 for (i = 0; i < 1024; i+=4)
 {
   vA = vec_ld( &A[i] );
   vB = vec_ld( &B[i] );
   vC = vec_mul( vA, vB );
   vec_st( vC, &C[i] );
 }

[edit] Basic block level automatic vectorization

This relatively new technique specifically targets modern SIMD architectures with short vector lengths. Although loops can be unrolled to increase the amount of SIMD parallelism in basic blocks, this technique targets to exploit SIMD parallelism within basic blocks rather than from loops. Because of this, SIMD codes can be generated from basic blocks outside loop nests. The two major steps are as follows.

  1. The innermost loop is unrolled by a factor of the vector length to form a large loop body.
  2. Isomorphic scalar instructions (that perform the same operation) are packed into a vector instruction if dependences do not prevent doing so.

To show step-by-step transformations for this approach, the same example is used again.

  • After loop unrolling (by the vector length, assumed to be 4 in this case)
 for (i = 0; i < 1024; i+=4)
 {
    sA0 = ld( &A[i+0] );
    sB0 = ld( &B[i+0] );
    sC0 = sA0 * sB0;
    st( sC0, &C[i+0] );
          ...
    sA3 = ld( &A[i+3] );
    sB3 = ld( &B[i+3] );
    sC3 = sA3 * sB3;
    st( sC3, &C[i+3] );
 }
  • After packing
 for (i = 0; i < 1024; i+=4)
 {
    (sA0,sA1,sA2,sA3) = ld( &A[i+0:i+3] );
    (sB0,sB1,sB2,sB3) = ld( &B[i+0:i+3] );
    (sC0,sC1,sC2,sC3) = (sA0,sA1,sA2,sA3) * (sB0,sB1,sB2,sB3);
    st( (sC0,sC1,sC2,sC3), &C[i+0:i+3] );
 }
  • After code generation
 for (i = 0; i < 1024; i+=4)
 {
   vA = vec_ld( &A[i] );
   vB = vec_ld( &B[i] );
   vC = vec_mul( vA, vB );
   vec_st( vC, &C[i] );
 }

Here, sA1, sB1, ... represent scalar variables and vA, vB, and vC represent vector variables.

Most automatically vectorizing commercial compilers use the conventional loop-level approach except the IBM XL Compiler that uses both.


In other languages

Static Wikipedia (no images)

aa - ab - af - ak - als - am - an - ang - ar - arc - as - ast - av - ay - az - ba - bar - bat_smg - bcl - be - be_x_old - bg - bh - bi - bm - bn - bo - bpy - br - bs - bug - bxr - ca - cbk_zam - cdo - ce - ceb - ch - cho - chr - chy - co - cr - crh - cs - csb - cu - cv - cy - da - de - diq - dsb - dv - dz - ee - el - eml - en - eo - es - et - eu - ext - fa - ff - fi - fiu_vro - fj - fo - fr - frp - fur - fy - ga - gan - gd - gl - glk - gn - got - gu - gv - ha - hak - haw - he - hi - hif - ho - hr - hsb - ht - hu - hy - hz - ia - id - ie - ig - ii - ik - ilo - io - is - it - iu - ja - jbo - jv - ka - kaa - kab - kg - ki - kj - kk - kl - km - kn - ko - kr - ks - ksh - ku - kv - kw - ky - la - lad - lb - lbe - lg - li - lij - lmo - ln - lo - lt - lv - map_bms - mdf - mg - mh - mi - mk - ml - mn - mo - mr - mt - mus - my - myv - mzn - na - nah - nap - nds - nds_nl - ne - new - ng - nl - nn - no - nov - nrm - nv - ny - oc - om - or - os - pa - pag - pam - pap - pdc - pi - pih - pl - pms - ps - pt - qu - quality - rm - rmy - rn - ro - roa_rup - roa_tara - ru - rw - sa - sah - sc - scn - sco - sd - se - sg - sh - si - simple - sk - sl - sm - sn - so - sr - srn - ss - st - stq - su - sv - sw - szl - ta - te - tet - tg - th - ti - tk - tl - tlh - tn - to - tpi - tr - ts - tt - tum - tw - ty - udm - ug - uk - ur - uz - ve - vec - vi - vls - vo - wa - war - wo - wuu - xal - xh - yi - yo - za - zea - zh - zh_classical - zh_min_nan - zh_yue - zu -

Static Wikipedia 2007 (no images)

aa - ab - af - ak - als - am - an - ang - ar - arc - as - ast - av - ay - az - ba - bar - bat_smg - bcl - be - be_x_old - bg - bh - bi - bm - bn - bo - bpy - br - bs - bug - bxr - ca - cbk_zam - cdo - ce - ceb - ch - cho - chr - chy - co - cr - crh - cs - csb - cu - cv - cy - da - de - diq - dsb - dv - dz - ee - el - eml - en - eo - es - et - eu - ext - fa - ff - fi - fiu_vro - fj - fo - fr - frp - fur - fy - ga - gan - gd - gl - glk - gn - got - gu - gv - ha - hak - haw - he - hi - hif - ho - hr - hsb - ht - hu - hy - hz - ia - id - ie - ig - ii - ik - ilo - io - is - it - iu - ja - jbo - jv - ka - kaa - kab - kg - ki - kj - kk - kl - km - kn - ko - kr - ks - ksh - ku - kv - kw - ky - la - lad - lb - lbe - lg - li - lij - lmo - ln - lo - lt - lv - map_bms - mdf - mg - mh - mi - mk - ml - mn - mo - mr - mt - mus - my - myv - mzn - na - nah - nap - nds - nds_nl - ne - new - ng - nl - nn - no - nov - nrm - nv - ny - oc - om - or - os - pa - pag - pam - pap - pdc - pi - pih - pl - pms - ps - pt - qu - quality - rm - rmy - rn - ro - roa_rup - roa_tara - ru - rw - sa - sah - sc - scn - sco - sd - se - sg - sh - si - simple - sk - sl - sm - sn - so - sr - srn - ss - st - stq - su - sv - sw - szl - ta - te - tet - tg - th - ti - tk - tl - tlh - tn - to - tpi - tr - ts - tt - tum - tw - ty - udm - ug - uk - ur - uz - ve - vec - vi - vls - vo - wa - war - wo - wuu - xal - xh - yi - yo - za - zea - zh - zh_classical - zh_min_nan - zh_yue - zu -

Static Wikipedia 2006 (no images)

aa - ab - af - ak - als - am - an - ang - ar - arc - as - ast - av - ay - az - ba - bar - bat_smg - bcl - be - be_x_old - bg - bh - bi - bm - bn - bo - bpy - br - bs - bug - bxr - ca - cbk_zam - cdo - ce - ceb - ch - cho - chr - chy - co - cr - crh - cs - csb - cu - cv - cy - da - de - diq - dsb - dv - dz - ee - el - eml - eo - es - et - eu - ext - fa - ff - fi - fiu_vro - fj - fo - fr - frp - fur - fy - ga - gan - gd - gl - glk - gn - got - gu - gv - ha - hak - haw - he - hi - hif - ho - hr - hsb - ht - hu - hy - hz - ia - id - ie - ig - ii - ik - ilo - io - is - it - iu - ja - jbo - jv - ka - kaa - kab - kg - ki - kj - kk - kl - km - kn - ko - kr - ks - ksh - ku - kv - kw - ky - la - lad - lb - lbe - lg - li - lij - lmo - ln - lo - lt - lv - map_bms - mdf - mg - mh - mi - mk - ml - mn - mo - mr - mt - mus - my - myv - mzn - na - nah - nap - nds - nds_nl - ne - new - ng - nl - nn - no - nov - nrm - nv - ny - oc - om - or - os - pa - pag - pam - pap - pdc - pi - pih - pl - pms - ps - pt - qu - quality - rm - rmy - rn - ro - roa_rup - roa_tara - ru - rw - sa - sah - sc - scn - sco - sd - se - sg - sh - si - simple - sk - sl - sm - sn - so - sr - srn - ss - st - stq - su - sv - sw - szl - ta - te - tet - tg - th - ti - tk - tl - tlh - tn - to - tpi - tr - ts - tt - tum - tw - ty - udm - ug - uk - ur - uz - ve - vec - vi - vls - vo - wa - war - wo - wuu - xal - xh - yi - yo - za - zea - zh - zh_classical - zh_min_nan - zh_yue - zu

Static Wikipedia February 2008 (no images)

aa - ab - af - ak - als - am - an - ang - ar - arc - as - ast - av - ay - az - ba - bar - bat_smg - bcl - be - be_x_old - bg - bh - bi - bm - bn - bo - bpy - br - bs - bug - bxr - ca - cbk_zam - cdo - ce - ceb - ch - cho - chr - chy - co - cr - crh - cs - csb - cu - cv - cy - da - de - diq - dsb - dv - dz - ee - el - eml - en - eo - es - et - eu - ext - fa - ff - fi - fiu_vro - fj - fo - fr - frp - fur - fy - ga - gan - gd - gl - glk - gn - got - gu - gv - ha - hak - haw - he - hi - hif - ho - hr - hsb - ht - hu - hy - hz - ia - id - ie - ig - ii - ik - ilo - io - is - it - iu - ja - jbo - jv - ka - kaa - kab - kg - ki - kj - kk - kl - km - kn - ko - kr - ks - ksh - ku - kv - kw - ky - la - lad - lb - lbe - lg - li - lij - lmo - ln - lo - lt - lv - map_bms - mdf - mg - mh - mi - mk - ml - mn - mo - mr - mt - mus - my - myv - mzn - na - nah - nap - nds - nds_nl - ne - new - ng - nl - nn - no - nov - nrm - nv - ny - oc - om - or - os - pa - pag - pam - pap - pdc - pi - pih - pl - pms - ps - pt - qu - quality - rm - rmy - rn - ro - roa_rup - roa_tara - ru - rw - sa - sah - sc - scn - sco - sd - se - sg - sh - si - simple - sk - sl - sm - sn - so - sr - srn - ss - st - stq - su - sv - sw - szl - ta - te - tet - tg - th - ti - tk - tl - tlh - tn - to - tpi - tr - ts - tt - tum - tw - ty - udm - ug - uk - ur - uz - ve - vec - vi - vls - vo - wa - war - wo - wuu - xal - xh - yi - yo - za - zea - zh - zh_classical - zh_min_nan - zh_yue - zu