4. Zip read of data
Note: some additional information will be provided on how to control the amount of output, etc.
5. Requirements
Write a Python program to do the following.
In data science, one often needs to get data from zipped data files on a regular basis. It is better to automate this process rather than do it manually.
Find some data from your project that is in zip file format. If your project has none, find a file that uses the zip format, such as one of the following (but not an OpenXML file).
Zip data file (preferred)
XPI (Cross-Platform Install) - Mozilla add-ons
APK (Android Package)
JAR (Java Archive)
... but not an OpenXML file such as docx, xlsx, pptx, etc.
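One way to screen out OpenXML files is sketched below. It assumes that an OpenXML package can be recognized by a root entry named `[Content_Types].xml`; the function name `looks_like_openxml` is hypothetical and not part of the starter program.

```python
import io
import zipfile


def looks_like_openxml(zip_source):
    """Return True if the archive has the root entry typical of OpenXML files.

    Assumption: docx, xlsx, and pptx packages carry "[Content_Types].xml"
    at the top level, while plain zip, XPI, APK, and JAR files do not.
    """
    with zipfile.ZipFile(zip_source) as archive:
        return "[Content_Types].xml" in archive.namelist()


def make_zip(names):
    """Build a small in-memory zip with the given member names (demo only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in names:
            archive.writestr(name, "placeholder")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(looks_like_openxml(make_zip(["data.csv"])))
    print(looks_like_openxml(make_zip(["[Content_Types].xml", "word/document.xml"])))
```

`zip_source` may be a file name in the current folder or an open binary file object.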
Create a Python program that reads the zip file and prints the first few lines of a few selected files.
If the zip file is more than, say, 100 KB, then do not put that zip file in the folder that is being submitted. Instead, include a few relevant parts of the code in your document and some of the output with interspersed explanatory text.
The Python program should be self-contained and should not require anything beyond the standard packages covered in class.
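The core of the program described above (open the zip file, pick a few members, print their first few lines) might look like the following minimal sketch. It uses only the standard `zipfile`, `io`, and `itertools` modules. The function names, the sample member names, and the UTF-8 text encoding are assumptions for illustration; your project data will differ.

```python
import io
import zipfile
from itertools import islice


def first_lines(zip_source, member, count=3):
    """Return up to `count` text lines from one member of a zip archive."""
    with zipfile.ZipFile(zip_source) as archive:
        with archive.open(member) as raw:
            # Assumption: the member is UTF-8 text; undecodable bytes are replaced.
            text = io.TextIOWrapper(raw, encoding="utf-8", errors="replace")
            return [line.rstrip("\r\n") for line in islice(text, count)]


def build_sample_zip():
    """Build a tiny in-memory zip so the sketch runs without any input file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("data.csv", "id,value\n1,10\n2,20\n3,30\n")
        archive.writestr("notes.txt", "first\nsecond\n")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    sample = build_sample_zip()
    for name in ("data.csv", "notes.txt"):
        print(name)
        for line in first_lines(sample, name, count=2):
            print("   ", line)
```

For a real data file, pass a file name in the current folder in place of the in-memory sample. Reading only the first `count` lines keeps the output short even when a member is large.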
In conveying what you have done, create a docx document that has the following parts.
A title with your name below the title.
A section, with a short introduction of your problem.
A section, with some explanation and a table of your data.
A section, with some explanation and a chart (image) of your data (from the table).
A section that concludes the document.
The submission will be in the form of a zip file. The zip file should contain the following.
The Python code files used. There should be no external references in the Python code other than to imports of known libraries. All file and other references should be to the current folder.
Do not include Python code files not used.
The document in docx form as one of the files in the folder. You can either create this document manually or use Python to generate it.
Any supporting files such as the generated image, data, etc. But, in this case, not the zip file used for input unless it is small.
There is a limit of about 1MB for the submission. Do not include any files not specifically needed for this work. Do not use subfolders - put everything in the root folder with files being submitted.
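Before submitting, the size limit and the no-subfolders rule can be checked with a short script. The sketch below is one possible check, not a required part of the assignment. It assumes "about 1MB" means 1,000,000 bytes, and the names `MAX_BYTES` and `check_submission` are hypothetical.

```python
import io
import zipfile

MAX_BYTES = 1_000_000  # assumption: "about 1MB" taken as 1,000,000 bytes


def check_submission(zip_source, size_bytes):
    """Return a list of problems found in a submission zip (empty if none)."""
    problems = []
    if size_bytes > MAX_BYTES:
        problems.append(f"archive is {size_bytes} bytes, over {MAX_BYTES}")
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            # Zip member names use "/" as the separator, so any "/" means
            # the entry is a subfolder or a file inside one.
            if "/" in name:
                problems.append(f"not in root folder: {name}")
    return problems


def make_zip(names):
    """Build a small in-memory zip with the given member names (demo only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in names:
            archive.writestr(name, "placeholder")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    demo = make_zip(["zipread.py", "report.docx", "sub/extra.py"])
    for problem in check_submission(demo, len(demo.getvalue())):
        print(problem)
```

For a zip file on disk, `os.path.getsize` on the file name gives the value to pass as `size_bytes`.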
If the requirements are unclear in any way, use any provided input and output to resolve the discrepancy.
6. Domain knowledge
Domain knowledge is the background knowledge that is useful in solving a problem, designing a solution, implementing the solution, etc. The following domain knowledge may be useful for this requirement.
The domain data will depend on your project data.
Note: In addition to specific domain knowledge, you should be familiar with all concepts covered to this point in the course.
7. Coding notes
The following coding examples and/or notes may be of use for this requirement.
See the class notes for additional relevant coding content.
Note: The code below will be specific to your project and data. Thus, there is no single solution to this work. The code and zip file provided for download are a template for getting started and for submission.
Make all necessary assumptions. Make no unnecessary assumptions.
In addition to specific coding examples, you should be familiar with all concepts covered to this point in the course.
8. Starter program
You are provided with the following starter program in file zipread.py.
Do not remove any comment that starts with two hash signs.
Here is the Python code.
You are to fill in the missing parts of the program according to the work requirements (see above).
In the comments at the top, you are to fill in the author (your name), help received (person and type of help), and pseudo-code parts which are not in the solution (usually provided, but you need to add them). Remove the parentheses too.
9. Possible solution
Begin solution
Here is a possible solution to the above problem. Note: Once a solution is provided, or the day before the next class, further submissions for this work receive no credit.
End solution
10. Scoring rubric
CS 496 - A10 : Asmt#10: Zip data read
Your grade: _ / 30
[LATE] Late or redo penalty: _ / -30
[SUBMIT] Not submitted properly: _ / -30
[STYLE] Document style requirements: _ / -10
[PROGRAM] Program content requirements: _ / 9
[OUTPUT] Program output in document: _ / 6
[CONTENT] Document content requirements: _ / 15
[CREDIT] Extra credit: _ / +6
Comments: