I have scoured Stack Overflow and the pandas documentation for a solution to this issue.
I am attempting to recursively move through a directory and concatenate all of the headers and their respective row values.
Below is what I have so far after much experimentation with other libraries:
import pandas as pd
import csv
import glob
import os
path = '.'
files_in_dir = [f for f in os.listdir(path) if f.endswith('csv')]
for filenames in files_in_dir:
    df = pd.read_csv(filenames)
    df.to_csv('out.csv', mode='a')
However, all of the headers and their corresponding values are stacked on top of each other. In addition, each file's headers and corresponding values are repeated twice (something to do with the for loop). My constraints are:
Writing out the headers and their corresponding values (without "stacking") - essentially concatenated one after the other
If the column headers in one file match another file's, then there should be no repetition; only the values should be appended as they are written to the one CSV file.
Since each file has different column headers and a different number of columns, these should all be added sequentially during processing. Nothing should be deleted. (A rough sketch of what I mean follows below.)
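To make these constraints concrete, this is roughly the behaviour I am after, sketched with pd.concat (I am not sure this is the right tool, and 'out.csv' is just a placeholder name):

import glob
import pandas as pd

# Read every CSV in the current directory into its own DataFrame
frames = [pd.read_csv(f) for f in glob.glob('*.csv')]

# pd.concat aligns columns by header name: shared headers line up in a
# single column, and headers that only appear in some files are kept,
# with the missing values filled in as NaN for the other files.
combined = pd.concat(frames, ignore_index=True, sort=False)
combined.to_csv('out.csv', index=False)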
I am wondering whether the best method is to merge, to concatenate, or to use some other pandas approach. Thanks.