Difference between revisions of "Data.table"

From kogic.kr

Revision as of 16:23, 27 November 2018

YalPak_Rtip

Convert a molten (long-format) data table into a wide data table (array type)

Input

                       Species                           ID       Var1 Var2         Val                   Group      Print_name
    1: Dendronephthya_gigantea g30906.t1.cds@g30906@000029F CDS_length Full 2.01600e+03 Non-symbiotic_cnidarian Carnation_coral
    2: Dendronephthya_gigantea g14782.t1.cds@g14782@000108F CDS_length Full 4.02000e+02 Non-symbiotic_cnidarian Carnation_coral
    3: Dendronephthya_gigantea   g9986.t1.cds@g9986@000064F CDS_length Full 8.40000e+02 Non-symbiotic_cnidarian Carnation_coral
    4: Dendronephthya_gigantea   g1279.t1.cds@g1279@000024F CDS_length Full 8.58000e+02 Non-symbiotic_cnidarian Carnation_coral
    5: Dendronephthya_gigantea   g9325.t1.cds@g9325@000042F CDS_length Full 8.61000e+02 Non-symbiotic_cnidarian Carnation_coral

Code

dcast(dt, ID ~ Var1, value.var = "Val")

Output

                                 ID CDS_length      FPKM
    1:       g10.t1.cds@g10@000002F       1185 30.363500
    2:     g100.t1.cds@g100@000002F        696  0.959006
    3:   g1000.t1.cds@g1000@000011F        660  0.000000
    4: g10000.t1.cds@g10000@000064F       1074  0.278465
    5: g10001.t1.cds@g10001@000064F        522  0.962268
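
Minimal example

The same long-to-wide conversion can be sketched on a small table. The table dt_long and the IDs geneA and geneB are hypothetical and only stand in for the real data; the sketch assumes each ID/Var1 pair occurs exactly once. If a pair is duplicated, dcast falls back to counting rows unless fun.aggregate is given.

```r
library(data.table)

# Hypothetical toy data in long (molten) format: one row per ID/variable pair
dt_long <- data.table(
  ID   = c("geneA", "geneA", "geneB", "geneB"),
  Var1 = c("CDS_length", "FPKM", "CDS_length", "FPKM"),
  Val  = c(1185, 30.3635, 696, 0.959006)
)

# Left side of the formula (ID) identifies the rows of the result;
# each distinct value of Var1 becomes a column, filled from Val
dt_wide <- dcast(dt_long, ID ~ Var1, value.var = "Val")
print(dt_wide)
```

The result has one row per ID and one column per distinct Var1 value (here CDS_length and FPKM).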

Data table aggregation with 'by'

Input

                                 ID CDS_length      FPKM   ord bin
    1:       g10.t1.cds@g10@000002F  10.210671 4.9242662 10665   8
    2: g10002.t1.cds@g10002@000064F  12.039262 2.3361320  3975   3
    3: g10008.t1.cds@g10008@000073F   9.162391 0.6201266   856   1
    4: g10011.t1.cds@g10011@000073F   9.942515 1.9781956  3149   3
    5: g10012.t1.cds@g10012@000073F  10.762382 0.5596289   785   1

Code

dt[, .(Mean_CDS_length = mean(CDS_length), Mean_FPKM = mean(FPKM)), by=bin]

Output

    bin Mean_CDS_length Mean_FPKM
 1:   8       10.187520 4.7044951
 2:   3       10.590668 2.0831168
 3:   1       10.488467 0.4904325
 4:   4       10.550280 2.6412267
 5:   2       10.541246 1.3801430
 6:   7       10.344120 4.1326888
 7:   6       10.377043 3.6221557
 8:  10        9.563917 7.4570552
 9:   5       10.509382 3.1425986
10:   9        9.941699 5.4668993
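
Minimal example

The grouping step can be sketched on a small table. The table dt_small and its values are hypothetical. With by, groups are returned in order of first appearance, which is why the bins in the output above are not sorted; keyby is shown as one way to get them sorted, if that is wanted.

```r
library(data.table)

# Hypothetical toy data: four genes assigned to two bins
dt_small <- data.table(
  ID         = c("geneA", "geneB", "geneC", "geneD"),
  CDS_length = c(10, 12, 9, 11),
  FPKM       = c(4, 2, 2, 3),
  bin        = c(2L, 1L, 2L, 1L)
)

# One row per bin, in order of first appearance (bin 2, then bin 1)
res <- dt_small[, .(Mean_CDS_length = mean(CDS_length),
                    Mean_FPKM       = mean(FPKM)), by = bin]

# Same aggregation, but sorted by bin (and keyed on it)
res_sorted <- dt_small[, .(Mean_CDS_length = mean(CDS_length),
                           Mean_FPKM       = mean(FPKM)), keyby = bin]

print(res)
print(res_sorted)
```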