Reproducibility Package

Scientific work only has value if it can be understood, verified, and reproduced. Your Reproducibility Package documents how you achieved your results and provides everything needed for another researcher to repeat your main findings.

This deliverable is graded jointly under the "Results, Analysis & Reflection" and "Reproducibility" criteria.

Purpose

The goal of the reproducibility package is not just to archive your files — it is to make your work transparent, traceable, and reusable. Think of it as your scientific fingerprint: a minimal, well-documented bundle that anyone with the right background could rerun.

Some of you will be continuing or building upon projects from previous student groups. If you find that parts of older projects are difficult to understand or reproduce, that frustration is exactly what we aim to eliminate. Your task is to ensure that future students can build on your work without ambiguity, lost data, or missing context. Good documentation is not only a courtesy — it is what turns your project into a lasting contribution to the CACoM ecosystem.

Required Contents

Your submission must include all materials needed to interpret or reproduce your project. Exact contents vary by project type, but the structure below applies to most teams.

Component	Description
Report	A short (3–6 pages) write-up summarizing motivation, methods, results, and conclusions. It may extend the poster or serve as a compact paper.
Code	All scripts, notebooks, or programs required to generate your key figures and results. Clearly labeled and runnable.
Data / data description	Either small data files or detailed instructions on how to obtain them (e.g., dataset name, link, preprocessing steps).
Figures & diagrams	Core plots, flowcharts, or architecture diagrams that illustrate your analysis or experimental setup.
README.md	A concise guide describing what's inside the package and how to reproduce your key results.
License / credits	Indicate sources of data and code; include any licenses where required.

Tip: If you used a private or sensitive dataset, describe the data access procedure and provide a small synthetic example to demonstrate your workflow.
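
As one possible way to provide such a synthetic example, the sketch below writes a small, entirely artificial time series to a CSV file. The function name `write_synthetic_recording`, the column names, and the Gaussian noise model are hypothetical choices for illustration; adapt them so the file has the same layout as your real data.

```python
import csv
import random
from pathlib import Path


def write_synthetic_recording(path, n_samples=100, seed=0):
    """Write a small, entirely artificial time series to a CSV file."""
    rng = random.Random(seed)  # fixed seed: identical content on every run
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="") as f:
        writer = csv.writer(f)
        # Hypothetical column names; mirror the layout of your real data.
        writer.writerow(["sample_index", "signal_value"])
        for i in range(n_samples):
            # Gaussian noise stands in for the real signal (an assumption).
            writer.writerow([i, round(rng.gauss(0.0, 1.0), 4)])
    return path
```

Calling `write_synthetic_recording("data/synthetic_example.csv")` creates the file relative to the current directory. Because the seed is fixed, the file can be regenerated at any time and does not need to contain anything derived from the protected data.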

Note: Exact structure may vary. For instance, survey or review-based projects may include their questionnaire, PRISMA flow diagram, or coding scheme instead of raw data files. The key is transparency and reproducibility within your specific project type.

Data Upload Policy

It is essential to distinguish what may be uploaded publicly and what must remain private.

Platform	What to Upload	What Not to Upload
GitHub (optional)	✅ Code, documentation, and synthetic or open datasets. ✅ Small mock examples that illustrate the workflow. ✅ Figures or plots generated from anonymized data.	❌ Real clinical, patient, or proprietary datasets. ❌ Any identifiable or licensed material that you do not have the rights to share. ❌ Raw recordings from CTG, IMU, or similar sensors.
Google Drive (official CACoM submission folder)	✅ All materials required for full internal reproducibility, including the actual datasets used (provided licenses and ethics allow internal academic sharing).	❌ Do not share Google Drive links publicly. This upload is strictly for course instructors and examiners.

🚨 Important

Never upload real patient or proprietary data to GitHub or any public service. If you do so, you may violate data protection laws and TUM regulations. Upload such data only to the private, course-managed Google Drive folder designated for your submission.

Note: All materials must be submitted to the Google Drive folder assigned to your group number, and confirmed by email to Prof. Martin Daumer (CC Pooja N. Annaiah). The folder link will be shared after topic approval.

Good Practices

Use relative paths and fixed random seeds where applicable.
Include environment files (environment.yml, requirements.txt, or Project.toml together with Manifest.toml).
Comment your code and label outputs with figure numbers that match your report/poster.
Export figures directly from your code as image files (PNG, PDF, SVG) rather than taking screenshots.
Prefer open formats over proprietary ones (CSV rather than Excel, PNG rather than PowerPoint).
If using Jupyter/Pluto notebooks, ensure they run from top to bottom without manual intervention.
Provide clear version notes or changelogs if building upon previous groups' work.
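
The first practice above (relative paths and fixed seeds) could be sketched as follows. The folder names `data` and `figures`, the seed value, and the helper `sample_indices` are illustrative assumptions, not course requirements.

```python
import random
from pathlib import Path

# Resolve paths relative to this file rather than to a user-specific
# absolute location such as a home directory.
try:
    PROJECT_ROOT = Path(__file__).resolve().parent
except NameError:  # __file__ is not defined inside notebooks
    PROJECT_ROOT = Path.cwd()

DATA_DIR = PROJECT_ROOT / "data"        # hypothetical folder name
FIGURE_DIR = PROJECT_ROOT / "figures"   # hypothetical folder name

SEED = 42  # arbitrary value; what matters is that it is fixed and documented


def sample_indices(n_total, n_draw, seed=SEED):
    """Draw a reproducible random subset of indices (illustrative helper)."""
    rng = random.Random(seed)  # local generator, independent of global state
    return sorted(rng.sample(range(n_total), n_draw))
```

Using a local generator instead of the global random state keeps results stable even when other code also draws random numbers. If you work with NumPy, `numpy.random.default_rng(seed)` plays the same role.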

Verification and Testing

Before submission:

Delete large intermediate data files and re-run your entire pipeline from raw inputs.
Verify that every figure and table in your report can be regenerated.
Have one team member unfamiliar with the code follow the README instructions. If they succeed without asking you for help, that is strong evidence your package is reproducible.
Be prepared to walk through your entire pipeline and explain how results were produced during your poster session discussion.
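
One way to check the second point automatically is to list the files your report depends on and confirm that each was rewritten during the latest run. This is a minimal sketch: the function name `not_regenerated` and the idea of comparing modification times against a recorded start time are assumptions for illustration, not a prescribed procedure.

```python
from pathlib import Path


def not_regenerated(output_dir, expected_names, run_started_at):
    """Return (name, reason) pairs for outputs that are missing or stale.

    run_started_at is a POSIX timestamp (e.g. from time.time()) recorded
    just before the pipeline was started.
    """
    output_dir = Path(output_dir)
    problems = []
    for name in expected_names:
        path = output_dir / name
        if not path.is_file():
            problems.append((name, "missing"))
        elif path.stat().st_mtime < run_started_at:
            # The file exists but was last written before this run began.
            problems.append((name, "not regenerated"))
    return problems
```

In practice you would record `time.time()`, run the pipeline, then call the function with the figure and table file names used in your report; an empty list means every output was rewritten. File timestamps can be coarse on some file systems, so treat a borderline result as a prompt to check manually.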

Note: You are not expected to produce a production-level software package. Clarity, completeness, and honesty matter more than perfection.

Ethical and Legal Considerations

Do not upload identifiable or patient data to public repositories.
Respect dataset licenses and include attributions.
If your project involved external collaborators or clinical data, ensure you have written permission to share materials.
For CTG or IMU data collected under institutional agreements, all sharing must remain internal.

Submission Format

Item	Format	Notes
Main package	`.zip` or `.tar.gz`	Upload to your group’s Google Drive folder
Notification	Email to Prof. Martin Daumer, CC Pooja N. Annaiah	Include group number and project title in the subject line
GitHub repository (optional)	Public or private link	Must contain only non-sensitive code and documentation
File size	As compact as reasonable	Remove unnecessary intermediates and compress large files. If your data are genuinely large (e.g., multi-GB recordings), briefly note this in your README and coordinate with the instructors.
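
If you prefer to build the archive from a script rather than by hand, Python's standard `shutil.make_archive` can produce either format. The wrapper `build_archive` and the folder and file names in the usage note are hypothetical.

```python
import shutil
from pathlib import Path


def build_archive(package_dir, archive_stem, fmt="zip"):
    """Pack package_dir into <archive_stem>.zip or <archive_stem>.tar.gz.

    Keep archive_stem outside package_dir so the archive is not
    written into the folder that is being packed.
    """
    if fmt not in ("zip", "gztar"):
        raise ValueError("fmt must be 'zip' or 'gztar'")
    package_dir = Path(package_dir).resolve()
    archive_path = shutil.make_archive(
        str(archive_stem),                 # output path without extension
        fmt,                               # 'zip' -> .zip, 'gztar' -> .tar.gz
        root_dir=str(package_dir.parent),  # paths inside the archive are relative to this
        base_dir=package_dir.name,         # only this folder is included
    )
    return Path(archive_path)
```

For example, `build_archive("reproducibility_package", "group_submission")` would create `group_submission.zip` containing the `reproducibility_package` folder. The resulting file still has to be uploaded to your group's Google Drive folder manually.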

Common Pitfalls

Caution: avoid the following.

Uploading real clinical or proprietary data to public GitHub
Missing README or unclear instructions
Using absolute file paths that break on other systems
Omitting data descriptions or environment files
Submitting code that does not run cleanly
Forgetting to include the report or figures referenced in your poster
Uploading only code without explanation or context
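
The absolute-path pitfall can often be caught with a simple text scan before submission. The sketch below is a heuristic only: the pattern covers a few common prefixes (`/home/`, `/Users/`, Windows drive letters) and the name `find_absolute_paths` is hypothetical. It will miss other forms and should not replace a manual check.

```python
import re

# A quote followed by a typical absolute-path prefix:
# /home/..., /Users/..., or a Windows drive such as C:\ or C:/
ABSOLUTE_PATH = re.compile(r"""["'](?:/(?:home|Users)/|[A-Za-z]:[\\/])""")


def find_absolute_paths(source_text):
    """Return (line_number, line) pairs that appear to contain an absolute path."""
    hits = []
    for number, line in enumerate(source_text.splitlines(), start=1):
        if ABSOLUTE_PATH.search(line):
            hits.append((number, line.strip()))
    return hits
```

Running this over each script, for example with `find_absolute_paths(Path("analysis.py").read_text())` after importing `Path` from `pathlib`, gives a list of lines to rewrite with relative paths. The file name here is only a placeholder.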

Quick Checklist

All code and figures regenerate key results
Data sources clearly documented or included in Google Drive submission
README explains exactly how to reproduce results
No sensitive or proprietary data uploaded to public GitHub
Package runs on a clean environment
Uploaded to Google Drive before the final deadline
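
To support the "clean environment" item, it helps to record which package versions produced your results. `pip freeze` is the usual tool for this; the sketch below is an in-Python alternative built on the standard `importlib.metadata` module. The function names are hypothetical, and the output is a snapshot for documentation, not a guaranteed-installable requirements file.

```python
import sys
from importlib import metadata


def environment_snapshot():
    """Return 'name==version' lines for the installed packages, sorted by name."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries whose metadata lacks a name
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


def python_version():
    """Return the interpreter version as 'major.minor.micro'."""
    return ".".join(str(part) for part in sys.version_info[:3])
```

Writing the result of `environment_snapshot()` and `python_version()` to a text file next to your README gives a future reader the exact versions to aim for when recreating the environment.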
