pandas DataFrame: select a set of columns including a sequence of columns


UPDATE: There is no need to use numpy.hstack; you can call numpy.r_ directly, as below.

Use iloc + numpy.r_:

    In [20]: df = DataFrame(randn(10, 3), columns=list('abc'))

    In [21]: df
    Out[21]:
              a         b         c
    0  0.228163 -1.311485 -1.335604
    1  0.292547 -1.636901  0.001765
    2  0.744605 -0.325580  0.205003
    3 -0.580471 -0.531553 -0.740697
    4  0.250574  1.076019 -0.594915
    5 -0.148449  0.076951 -0.653595
    6 -1.065314 -0.166018 -1.471532
    7  1.133336 -0.529738 -1.213841
    8 -1.715281 -2.058831  0.113237
    9 -0.382412 -0.072540  0.294853

    [10 rows x 3 columns]

    In [22]: df.iloc[:, r_[:2]]
    Out[22]:
              a         b
    0  0.228163 -1.311485
    1  0.292547 -1.636901
    2  0.744605 -0.325580
    3 -0.580471 -0.531553
    4  0.250574  1.076019
    5 -0.148449  0.076951
    6 -1.065314 -0.166018
    7  1.133336 -0.529738
    8 -1.715281 -2.058831
    9 -0.382412 -0.072540

    [10 rows x 2 columns]
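The session above assumes that DataFrame, randn and r_ are already in the namespace. A self-contained sketch of the same selection with explicit imports might look like this; the seeded generator is only an assumption to make the example reproducible and is not part of the original session:

```python
import numpy as np
import pandas as pd

# Reproducible stand-in for the random frame used in the session above.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((10, 3)), columns=list("abc"))

# np.r_[:2] expands the slice into the integer array [0, 1],
# which iloc then treats as column positions.
first_two = df.iloc[:, np.r_[:2]]
print(first_two.columns.tolist())
```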

To concatenate integer ranges use numpy.r_:

    In [35]: df = DataFrame(randn(10, 6), columns=list('abcdef'))

    In [36]: df.iloc[:, r_[:2, 2:df.columns.size:2]]
    Out[36]:
              a         b         c         e
    0 -1.358623 -0.622909  0.025609 -1.166303
    1  0.527027  0.310530  2.892384  0.190451
    2 -0.251138 -1.246113  0.738264  0.062078
    3 -1.716028  0.419139  0.060225 -1.191527
    4 -1.308635  0.045396 -0.599367 -0.202491
    5 -0.620343  0.796364 -0.008802  0.160020
    6  0.199739  0.111816 -0.278119  1.051317
    7 -0.311206  0.090348 -0.237887  0.958215
    8  0.363161  2.449031  1.023352  0.743853
    9  0.039451 -0.855733 -0.836921 -0.835078

    [10 rows x 4 columns]
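To see why this works, it can help to look at what numpy.r_ actually produces before it reaches iloc. The sketch below uses a zero-filled frame as a placeholder (an assumption, since only the column layout matters here) and also shows that single positions and slices can be mixed:

```python
import numpy as np
import pandas as pd

# Placeholder frame: only the six column labels matter for this illustration.
df = pd.DataFrame(np.zeros((10, 6)), columns=list("abcdef"))

# Each slice is expanded and the pieces are concatenated into one index array.
positions = np.r_[:2, 2:df.columns.size:2]
print(positions)  # [0 1 2 4]

# A single position and a slice can be combined: column 0, then columns 3 to 5.
mixed = np.r_[0, 3:6]
print(df.columns[mixed].tolist())  # ['a', 'd', 'e', 'f']
```

Because r_ knows nothing about the frame, the end of each range is given explicitly here, as the original does with df.columns.size.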


You can now use similar syntax in Python with the datar package:

    >>> from datar.all import c, f, select
    >>> from datar.datasets import starwars
    >>>
    >>> starwars
                  name    height      mass hair_color   skin_color eye_color  birth_year      sex     gender homeworld  species
              <object> <float64> <float64>   <object>     <object>  <object>   <float64> <object>   <object>  <object> <object>
    0   Luke Skywalker     172.0      77.0      blond         fair      blue        19.0     male  masculine  Tatooine    Human
    1            C-3PO     167.0      75.0        NaN         gold    yellow       112.0     none  masculine  Tatooine    Droid
    2            R2-D2      96.0      32.0        NaN  white, blue       red        33.0     none  masculine     Naboo    Droid
    3      Darth Vader     202.0     136.0       none        white    yellow        41.9     male  masculine  Tatooine    Human
    ..             ...       ...       ...        ...          ...       ...         ...      ...        ...       ...      ...
    4      Leia Organa     150.0      49.0      brown        light     brown        19.0   female   feminine  Alderaan    Human
    82             Rey       NaN       NaN      brown        light     hazel         NaN   female   feminine       NaN    Human
    83     Poe Dameron       NaN       NaN      brown        light     brown         NaN     male  masculine       NaN    Human
    84             BB8       NaN       NaN       none         none     black         NaN     none  masculine       NaN    Droid
    85  Captain Phasma       NaN       NaN    unknown      unknown   unknown         NaN      NaN        NaN       NaN      NaN
    86   Padmé Amidala     165.0      45.0      brown        light     brown        46.0   female   feminine     Naboo    Human

    [87 rows x 11 columns]
    >>>
    >>> starwars >> select(c(1, f[3:5], 7))
                  name      mass hair_color   skin_color  birth_year
              <object> <float64>   <object>     <object>   <float64>
    0   Luke Skywalker      77.0      blond         fair        19.0
    1            C-3PO      75.0        NaN         gold       112.0
    2            R2-D2      32.0        NaN  white, blue        33.0
    3      Darth Vader     136.0       none        white        41.9
    ..             ...       ...        ...          ...         ...
    4      Leia Organa      49.0      brown        light        19.0
    82             Rey       NaN      brown        light         NaN
    83     Poe Dameron       NaN      brown        light         NaN
    84             BB8       NaN       none         none         NaN
    85  Captain Phasma       NaN    unknown      unknown         NaN
    86   Padmé Amidala      45.0      brown        light        46.0

    [87 rows x 5 columns]
    >>>
    >>> # even with column names
    >>> starwars >> select(c(f.name, f[f.mass:f.skin_color], f.birth_year))
                  name      mass hair_color   skin_color  birth_year
              <object> <float64>   <object>     <object>   <float64>
    0   Luke Skywalker      77.0      blond         fair        19.0
    1            C-3PO      75.0        NaN         gold       112.0
    2            R2-D2      32.0        NaN  white, blue        33.0
    3      Darth Vader     136.0       none        white        41.9
    ..             ...       ...        ...          ...         ...
    4      Leia Organa      49.0      brown        light        19.0
    82             Rey       NaN      brown        light         NaN
    83     Poe Dameron       NaN      brown        light         NaN
    84             BB8       NaN       none         none         NaN
    85  Captain Phasma       NaN    unknown      unknown         NaN
    86   Padmé Amidala      45.0      brown        light        46.0

    [87 rows x 5 columns]

I am the author of the datar package. Feel free to submit issues if you have any questions.