n = 3
df = pd.DataFrame(data=np.arange(0,n**2,1,dtype=np.int16).reshape((n,n)))#, columns=["a","b","c"])
df
|   | 0 | 1 | 2 |
|---|---|---|---|
| 0 | 0 | 1 | 2 |
| 1 | 3 | 4 | 5 |
| 2 | 6 | 7 | 8 |
n = 3
df2 = pd.DataFrame(data=np.arange(0,n**2,1,dtype=np.int16).reshape((n,n)), columns=[1,3,0], index=[2,0,1])
df2
|   | 1 | 3 | 0 |
|---|---|---|---|
| 2 | 0 | 1 | 2 |
| 0 | 3 | 4 | 5 |
| 1 | 6 | 7 | 8 |
The pandas method reindex selects the index/column labels that already exist and fills in the ones that do not.
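The call that produced the table that follows is not shown. A minimal sketch that reproduces it, assuming labels 0 to 3 were requested on both axes with fill_value=0 (the name df2_full is only illustrative):

```python
import numpy as np
import pandas as pd

n = 3
# Same frame as df2 above: shuffled integer labels on both axes.
df2 = pd.DataFrame(
    np.arange(0, n**2, dtype=np.int16).reshape((n, n)),
    columns=[1, 3, 0],
    index=[2, 0, 1],
)

# Ask for labels 0..3 on both axes. Existing labels are reordered;
# row 3 and column 2 do not exist in df2, so they are created and
# filled with fill_value (assumed to be 0 here).
df2_full = df2.reindex(index=range(4), columns=range(4), fill_value=0)
print(df2_full)
```

Without fill_value the new cells would hold NaN and the frame would be upcast to float.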
|   | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| 0 | 5 | 3 | 0 | 4 |
| 1 | 8 | 6 | 0 | 7 |
| 2 | 2 | 0 | 0 | 1 |
| 3 | 0 | 0 | 0 | 0 |
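When an operation involves a label that exists in only one of the two DataFrames, the result at that position is NaN.
The pandas method add takes a fill_value argument, which lets us choose the value substituted for the missing operand in these situations.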
Other operator methods that also accept fill_value:
radd
sub, rsub
div, rdiv
floordiv, rfloordiv
mul, rmul
pow, rpow
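The cell behind the table below is missing. One combination that reproduces it, as a sketch: add the 3x3 df to the 4x4 reindexed frame with fill_value=1 (both the reindexed operand and the fill value are assumptions inferred from the numbers; df2_full, plain and filled are illustrative names):

```python
import numpy as np
import pandas as pd

n = 3
df = pd.DataFrame(np.arange(0, n**2, dtype=np.int16).reshape((n, n)))
df2 = pd.DataFrame(
    np.arange(0, n**2, dtype=np.int16).reshape((n, n)),
    columns=[1, 3, 0],
    index=[2, 0, 1],
)
df2_full = df2.reindex(index=range(4), columns=range(4), fill_value=0)

# The plain operator aligns on labels; df has no row 3 and no column 3,
# so those positions come out as NaN.
plain = df + df2_full

# With fill_value, the operand that is missing on one side is replaced
# by 1 before the addition.
filled = df.add(df2_full, fill_value=1)
print(filled)
```

If a position is missing in both frames, the result there is still NaN.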
|   | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| 0 | 5.0 | 4.0 | 2.0 | 5.0 |
| 1 | 11.0 | 10.0 | 5.0 | 8.0 |
| 2 | 8.0 | 7.0 | 8.0 | 2.0 |
| 3 | 1.0 | 1.0 | 1.0 | 1.0 |
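Broadcasting a Series against a DataFrame matches the Series labels against the columns by default, so the operation is repeated for every row. To match on the row index instead, the axis has to be passed explicitly.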
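The divisor used for the table below is not shown. A sketch that reproduces it, assuming each row was divided by its own mean (row_means, by_columns and by_rows are illustrative names):

```python
import numpy as np
import pandas as pd

n = 3
df = pd.DataFrame(np.arange(0, n**2, dtype=np.int16).reshape((n, n)))

# One value per row label: 1.0, 4.0, 7.0
row_means = df.mean(axis=1)

# Default axis="columns": the Series labels are matched against the
# column labels, so column j is divided by row_means[j].
by_columns = df.div(row_means)

# axis="index": the Series labels are matched against the row labels,
# so every row is divided by its own mean.
by_rows = df.div(row_means, axis="index")
print(by_rows)
```

Both calls run without error here only because the row and column labels happen to coincide; with different labels the default would produce NaN columns.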
|   | 0 | 1 | 2 |
|---|---|---|---|
| 0 | 0.000000 | 1.0 | 2.000000 |
| 1 | 0.750000 | 1.0 | 1.250000 |
| 2 | 0.857143 | 1.0 | 1.142857 |
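The opposite is true for apply, where you select axis="columns" to have the summation run across the columns, giving one result per row.
apply can also return a Series for each row or column, in which case the result is a DataFrame. This is probably how describe and info work.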
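A small sketch of both behaviours on the same 3x3 frame. The helper min_max is hypothetical and only shows apply returning a Series; it is not a claim about how describe is implemented:

```python
import numpy as np
import pandas as pd

n = 3
df = pd.DataFrame(np.arange(0, n**2, dtype=np.int16).reshape((n, n)))

# Default axis=0: the function receives each column in turn.
col_sums = df.apply(lambda s: s.sum())

# axis="columns": the function receives each row, so the sum runs
# across the columns and there is one result per row.
row_sums = df.apply(lambda s: s.sum(), axis="columns")


def min_max(s):
    # Hypothetical helper: returning a Series makes apply build a DataFrame,
    # with the Series labels as the new row index.
    return pd.Series({"min": s.min(), "max": s.max()})


summary = df.apply(min_max)
print(summary)
```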
The element-wise “apply” for DataFrames is called applymap. It is equivalent to map on a Series. Since pandas 2.1, applymap is deprecated in favor of DataFrame.map.
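A sketch of the element-wise call next to its Series counterpart. The elementwise variable is an illustrative shim so the example runs on pandas versions with or without DataFrame.map:

```python
import numpy as np
import pandas as pd

n = 3
df = pd.DataFrame(np.arange(0, n**2, dtype=np.int16).reshape((n, n)))

# Use DataFrame.map when it exists, otherwise fall back to applymap.
elementwise = df.map if hasattr(df, "map") else df.applymap

# The function is called once per cell.
squared = elementwise(lambda x: x ** 2)

# Same idea on a single column (a Series).
first_col_squared = df[0].map(lambda x: x ** 2)
print(squared)
```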
Creating a categorical from cut
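# 20 random integers from 0 to 8 (the upper bound 9 is exclusive)
nums = np.random.randint(0, 9, 20, dtype=np.int8)
# Edges 0, 2, 4, 6, 8 give four right-closed intervals; a 0 falls outside (0, 2] and becomes NaN
bins = range(0, 10, 2)
cat_var = pd.cut(nums, bins)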
cat_var
[(2, 4], (2, 4], (4, 6], (2, 4], (6, 8], ..., (6, 8], (2, 4], (0, 2], (6, 8], (6, 8]]
Length: 20
Categories (4, interval[int64, right]): [(0, 2] < (2, 4] < (4, 6] < (6, 8]]
array([ True, True, True, True, True, True, True, True, True,
True, True, True, True, True, True, True, True, True,
True, True])
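Because the cell above draws random numbers, its output changes on every run. A deterministic sketch of the same binning, using hand-picked values (an assumption, chosen to hit the edge cases) and the illustrative name cat_incl:

```python
import numpy as np
import pandas as pd

nums = np.array([0, 1, 2, 3, 8])
bins = range(0, 10, 2)  # edges 0, 2, 4, 6, 8

# Intervals are closed on the right: (0, 2], (2, 4], (4, 6], (6, 8].
cat_var = pd.cut(nums, bins)

# codes holds the position of each value's interval; -1 marks a value
# that fell outside every bin (here the 0), which shows up as NaN.
print(cat_var.codes)

# include_lowest=True also closes the first interval on the left,
# so 0 is assigned to the first bin.
cat_incl = pd.cut(nums, bins, include_lowest=True)
print(cat_incl.codes)
```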