Python is a widely used high-level, general-purpose, interpreted programming language. It is often used as the "glue" within the High Performance Computing community.
For more information about Python, you can visit the following resource:
https://en.wikipedia.org/wiki/Python_(programming_language)
Getting Started with Python
You can access and start using Python with the following steps:
- Connect to the Tufts High Performance Compute Cluster. See Connecting for a detailed guide.
- Load the Python module with the following command:
module load python
Note that you can see a list of all available modules (potentially including different versions of Python) by typing:
module avail
You can request a particular version of Python with the module load command, or use the generic module name (python) to load the latest version.
- Start a Python session by typing:
python
- At the Python prompt (>>>), run a test command such as:
print("Hello, World!")
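Because the module system may offer several Python versions, it can be useful to confirm which interpreter a session is actually running. The following is a minimal sketch using only the standard library; the helper name describe_interpreter is hypothetical and chosen for illustration, and the values printed will depend on the module that was loaded.

```python
import platform
import sys


def describe_interpreter():
    """Return a short summary of the running Python interpreter.

    Hypothetical helper for illustration; not part of any cluster tooling.
    """
    return {
        # Version string of the interpreter, e.g. "major.minor.patch"
        "version": platform.python_version(),
        # Filesystem path of the interpreter binary (may be empty in
        # unusual embedded setups)
        "executable": sys.executable,
    }


info = describe_interpreter()
print("Python version: {}".format(info["version"]))
print("Interpreter path: {}".format(info["executable"]))
```

If the reported version is not the one expected, loading a different Python module and starting a new session should change the output.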
For a more detailed overview of Python and how it relates to Big Data or High Performance Computing (HPC), please contact tts-research@tufts.edu for information about future workshops.