News and Announcements from OSG Operations > GOC Service Update - Tuesday, February 28th at 14:00

The GOC will upgrade the following services beginning Tuesday, February 28th at 14:00 UTC. The GOC reserves 8 hours in the unlikely event that unexpected problems are encountered.

Data

* Update internal database servers data1 and data2

Reports

* Upgrade production report generator to CentOS 6.8

VOMS

* Upgrade production instance to CentOS 6.8
* Take over operation of OSG VO voms from FNAL

Twiki

* Back up configuration changes

Web Services

* Basic updates and patches to OSG web presence.

All Services

* Operating system updates; reboots will be required. The usual HA mechanisms will be used, but some services will experience brief outages.

News and Announcements from OSG Operations > Announcing OSG CA Certificate Update

We are pleased to announce a data release for the OSG Software Stack.
Data releases do not contain any software changes.

This release contains updated CA Certificates based on IGTF 1.80:
- Discontinued BEGrid2008 (BELNET) classic authority (BE)

Release notes and pointers to more documentation can be found at:

https://www.opensciencegrid.org/bin/view/Documentation/Release3/Release33212

Need help? Let us know:

https://www.opensciencegrid.org/bin/view/Documentation/Release3/HelpProcedure

We welcome feedback on this release!

Condor Project News > HTCondor Week 2017 web page available ( February 15, 2017 )

The HTCondor Week 2017 web page is now available. This web page includes information about nearby hotel options (note that HTCondor Week 2017 will be held at a different location than the last few HTCondor Weeks, so that may affect your hotel choice). Registration should be open by the end of February; at this time we anticipate a registration fee of $85/day.

News and Announcements from OSG Operations > Announcing OSG Software version 3.3.21

We are pleased to announce OSG Software version 3.3.21.

Changes to OSG 3.3.21 include:
- osg-configure 1.6.1: Additional support for ATLAS AGIS
- GlideinWMS 3.2.17: See Release Notes
- Important bug fixes for CVMFS server
- HTCondor 8.4.11: Final bug fix release of the 8.4 series
- HTCondor-CE 2.1.2: Avoid crash by being more liberal in what we accept
- Added 2 scripts to manage VO and CA certificates in tarball installations

Changes to the Upcoming Repository include:
- Singularity 2.2.1: Moderate Severity security update

Release notes and pointers to more documentation can be found at:

https://www.opensciencegrid.org/bin/view/Documentation/Release3/Release3321

Need help? Let us know:

https://www.opensciencegrid.org/bin/view/Documentation/Release3/HelpProcedure

We welcome feedback on this release!

Derek's Blog > Deploying Docs on Github with Travis-CI

It is very common to deploy docs from a Github repo to a Github Pages site. In the past few days, I have set up several repos that push to Github Pages using Travis-CI continuous integration, and I wanted to document how easy it is here.

Create Deploy Key

After the repo is created, the first step is to create a deploy key.

ssh-keygen -t rsa -b 4096 -C "djw8605@gmail.com" -f deploy-key

Add the deploy-key.pub contents to your repo’s settings under Settings -> Deploy Keys. Be sure to check “Allow write access”. The deploy key will be used to authenticate the Travis-CI build so that it can push the website.

We will next have to encrypt the deploy-key so we can commit it to our repository safely.

Encrypt Deploy-key

First, you will need to install the travis command line tool, which is distributed as a Ruby gem. After installing Ruby, you can run the command:

gem install travis

Next, you will need to enable the repo to be built on Travis-CI. Log into Travis-CI and go to “Account”. Within this menu, search for the name of your repo and click to enable it.

Enable Travis-CI Repo

Inside the repository’s git repo on your own computer, run the command:

travis encrypt-file deploy-key
...    
openssl aes-256-cbc -K $encrypted_1d262b48bc9b_key -iv $encrypted_1d262b48bc9b_iv -in deploy-key.enc -out deploy-key -d

This will encrypt the deploy-key with the Travis-CI public key, so that it can only be decrypted on the Travis-CI infrastructure. The openssl line printed above is important to remember: you will copy and paste it into the .travis.yml.

Configure Travis-CI

For most of my Travis-CI configurations, I copy from my previous configurations. Travis-CI is configured in a specially named file in your repo named .travis.yml. Here is an example configuration that builds MkDocs documentation.

env:
  global:
  - GIT_NAME: "'Markdown autodeploy'"
  - GIT_EMAIL: djw8605@gmail.com
  - GH_REF: git@github.com:opensciencegrid/security.git
language: python
before_script:
- pip install mkdocs
- pip install MarkdownHighlight
script:
- openssl aes-256-cbc -K $encrypted_1d262b48bc9b_key -iv $encrypted_1d262b48bc9b_iv -in deploy-key.enc -out deploy-key -d
- chmod 600 deploy-key
- eval `ssh-agent -s`
- ssh-add deploy-key
- git config user.name "Automatic Publish"
- git config user.email "djw8605@gmail.com"
- git remote add gh-token "${GH_REF}";
- git fetch gh-token && git fetch gh-token gh-pages:gh-pages;
- if [ "${TRAVIS_PULL_REQUEST}" = "false" ]; then echo "Pushing to github"; PYTHONPATH=src/ mkdocs gh-deploy -v --clean --remote-name gh-token; git push gh-token gh-pages; fi;

You can see that the openssl command that was printed during encryption appears in the script section. Be sure to copy and paste it completely into your .travis.yml file.

This file instructs Travis-CI to:

  1. Install mkdocs
  2. Decrypt the deploy-key
  3. Build the MkDocs documentation
  4. Push the docs to the gh-pages branch of the repo

Commit and be prosperous

Commit the .travis.yml and the deploy-key.enc. Be sure not to commit the unencrypted deploy-key. And everything should be good to go!
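
A minimal sketch of that final commit, using the file names from above (the commit message and the master branch name are assumptions; adding the unencrypted key to .gitignore keeps it from being committed by accident):

# Keep the unencrypted key out of the repository
echo "deploy-key" >> .gitignore
echo "deploy-key.pub" >> .gitignore

# Commit only the Travis configuration, the encrypted key, and .gitignore
git add .travis.yml deploy-key.enc .gitignore
git commit -m "Add Travis-CI configuration and encrypted deploy key"
git push origin master   # assumes your default branch is master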

The above examples were from the OSG Security docs repo.


News and Announcements from OSG Operations > GOC Service Update - Tuesday, February 14th at 14:00 UTC

The GOC will upgrade the following services beginning Tuesday, February 14th at 14:00 UTC. The GOC reserves 8 hours in the unlikely event unexpected problems are encountered.

Oasis
- Upgrade oasis and oasis-replica to CentOS 7.3

OIM
- Update OIM to reject requests for host certs that put the user over quota - https://jira.opensciencegrid.org/browse/SOFTWARE-2472
- OIM 3.68

Event
- SSL/TLS integration for RabbitMQ instance

Reports
- Upgrade production report generator to CentOS 6.8

VOMS
- Upgrade production instance to CentOS 6.8
- Take over operation of OSG VO voms from FNAL

Pegasus news feed > Pegasus 4.7.3 Released

We are happy to announce the release of Pegasus 4.7.3. Pegasus 4.7.3 is a minor release that includes improvements and bug fixes to the 4.7.2 release. It includes a bug fix without which monitoring breaks for users running HTCondor 8.5.8 or higher.

Improvements

  1. [PM-1109] – dashboard to display errors if a job is killed instead of exiting with non zero exitcode
    • pegasus-monitord did not pass signal information from the kickstart records to the monitoring database. If a job fails because of a signal, it now creates an error message indicating the signal information and populates it in the database.
  2. [PM-1129] – dashboard should display database and pegasus version
  3. [PM-1138] – Pegasus dashboard pie charts should distinguish between running and unsubmitted
  4. [PM-1155] – remote cleanup jobs should have file url’s if possible

Bugs Fixed

  1. [PM-1132] – Hashed staging mapper doesn’t work correctly with sub dax generation jobs
    • For large workflows with dax generation jobs, planning broke for sub workflows if the dax was generated in a hashed directory structure. This is now fixed.
    • Note: As a result of this fix, pegasus-plan prescripts for sub workflows are now invoked by pegasus-lite in all cases
  2. [PM-1135] – pegasus.transfer.bypass.input.staging breaks symlinking on the local site
  3. [PM-1136] – With bypass input staging some URLs are ending up in the wrong site
  4. [PM-1147] – pegasus-transfer should check that files exist before trying to transfer them
    • When a source file URL did not exist, pegasus-transfer still attempted multiple retries, resulting in hard-to-read error messages. This was fixed: pegasus-transfer no longer attempts retries on a source if the source file does not exist.
  5. [PM-1151] – pegasus-monitord fails to populate stampede DB correctly when workflow is run on HTCondor 8.5.8
  6. [PM-1152] – pegasus-analyzer not showing stdout and stderr of failed transfer jobs
    • When an application outputs a large amount of stdout+stderr, we store only the first 64K in the monitoring database, combined for a single or clustered job. There was a bug whereby if a single task output more than 64K, nothing was populated. This is fixed.
  7. [PM-1153] – Pegasus creates extraneous spaces when replacing <file name=”something” />
    • DAX parser was updated to not add extraneous spaces when constructing the argument string for jobs
  8. [PM-1154] – regex too narrow for GO names with dashes
  9. [PM-1157] – monitord replay should work on submit directories that are moved
    • Pegasus-generated submit files have absolute paths. However, for debugging purposes a submit directory might be moved to a different host where those paths don’t exist. monitord now searches for files based on relative paths from the top-level submit directory. This enables users to repopulate their workflow databases easily.



News and Announcements from OSG Operations > XSEDE Ticket Exchange

Colleagues,

The OSG Operations ticketing interface is now able to exchange tickets with the XSEDE ticketing interface.  This means that if you are having issues with XSEDE or need support from the XSEDE operators, you can open a ticket with us (https://ticket.opensciencegrid.org) and let us know that the ticket needs to go to XSEDE.  Once we have this information we can send the ticket to XSEDE and they will be able to work in their ticket environment while you continue to work in the OSG environment.  

This should help everyone on both sides in their support workflows.  Please let us know if you have any questions.

News and Announcements from OSG Operations > Announcing OSG CA Certificate and VO Package Updates

We are pleased to announce a data release for the OSG Software Stack.
Data releases do not contain any software changes.

This release contains updated CA Certificates based on IGTF 1.79:
- Updated UNLPGrid CA with extended validity period (AR)
- Fixed regular expressions in CILogon and NCSA CA namespaces files (US)
- Included rollover CA IRAN-GRID-CGC-G2 (IR)
- Corrected an incorrect line in selected info files for DigiCert (US)
- Discontinued expiring NECTEC CA (TH)

This release also contains VO Package v70:
- Deleted MCDRD VO

Release notes and pointers to more documentation can be found at:

https://www.opensciencegrid.org/bin/view/Documentation/Release3/Release33202

Need help? Let us know:

https://www.opensciencegrid.org/bin/view/Documentation/Release3/HelpProcedure

We welcome feedback on this release!

News and Announcements from OSG Operations > Reminder: OSG All-Hands Meeting Registration is now open! Registration Closes Feb 19!

REMINDER: Registration closes in just 25 days!

OSG All Hands Meeting 2017 - San Diego Supercomputer Center - University
of California San Diego

Registration is now open for the All-Hands Meeting of the Open Science Grid,
hosted by the San Diego Supercomputer Center (http://www.sdsc.edu/),
March 6-9, 2017, in La Jolla, CA.

Topics to be discussed will include:

* Cyberinfrastructure partnerships: university research computing
HPC centers, XSEDE XD providers, DOE laboratories, NSF Large Facility
computing organizations and commercial cloud providers. Technologies,
strategies, ideas, and discussions on how OSG can foster partnerships
across the widest possible range of CI.

* How high throughput computing accelerates research, and how OSG can help
users scale up.

* Usability challenges and solutions for distributed high throughput
computing applications.

* Connecting virtual organizations, campus researchers and XSEDE users to
the OSG: command line, science gateways, and workflow frameworks.

* Training and education workshops.

* Serving more of the "long tail" of science with high throughput parallel
computing: incorporating multi-core, GPU and virtual cluster resources
into science workflows using shared and allocated distributed
infrastructure.

* Advanced network analytics services for national Science DMZ
infrastructure.

As has been the custom, the 2017 OSG AHM will be co-located with the U.S.
Large Hadron Collider (LHC at CERN) computing facility meetings.

Logistical information, registration and agenda are available at
https://www.opensciencegrid.org/AHM2017 (This will redirect to the
eiseverywhere.com domain.)

If you have any questions, please see the contact page on the above link
for who to email.

We look forward to seeing you in March!

Condor Project News > HTCondor 8.6.0 released! ( January 26, 2017 )

The HTCondor team is pleased to announce the release of HTCondor 8.6.0. After a year of development, this is the first release of the new stable series. Highlights of this release are: condor_q shows only the current user's jobs by default; condor_q summarizes related jobs (batches) on a single line by default; Users can define their own job batch name at job submission time; Immutable/protected job attributes make SUBMIT_REQUIREMENTS more useful; The shared port daemon is enabled by default; Jobs run in cgroups by default; HTCondor can now use IPv6 addresses (Prefers IPv4 when both present); DAGMan: Able to easily define SCRIPT, VARs, etc., for all nodes in a DAG; DAGMan: Revamped priority implementation; DAGMan: New splice connection feature; New slurm grid type in the grid universe for submitting to Slurm; Numerous improvements to Docker support; Several enhancements in the python bindings. Further details can be found in the Upgrade Section. A few bugs were fixed during our stabilization effort. More details about the fixes can be found in the Version History. HTCondor 8.6.0 binaries and source code are available from our Downloads page.
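
As a quick illustration of the new batch-name behavior (the batch name and submit file name below are hypothetical):

# Submit a group of jobs under a user-defined batch name
condor_submit -batch-name weekly-analysis jobs.sub

# condor_q now shows only your own jobs by default and
# summarizes each batch of related jobs on a single line
condor_q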

Condor Project News > HTCondor 8.4.11 released! ( January 23, 2017 )

The HTCondor team is pleased to announce the release of HTCondor 8.4.11. A stable series release contains significant bug fixes. This release is the last planned release of the 8.4 series. Highlights of this release are: Fixed a bug which delayed startd access to startd cron job results; Fixed a bug in pslot preemption that could delay jobs starting; Fixed a bug in job cleanup at job lease expiration if using glexec; Fixed a bug in locating ganglia shared libraries on Debian and Ubuntu. Further details can be found in the Version History. HTCondor 8.4.11 binaries and source code are available from our Downloads page.

News and Announcements from OSG Operations > 2017 Open Science Grid All-Hands Meeting Registration Now Open!

OSG All Hands Meeting 2017 - San Diego Supercomputer Center - University
of California San Diego

Registration is now open for the All-Hands Meeting of the Open Science Grid,
hosted by the San Diego Supercomputer Center (http://www.sdsc.edu/),
March 6-9, 2017, in La Jolla, CA.

Topics to be discussed will include:

* Cyberinfrastructure partnerships: university research computing
HPC centers, XSEDE XD providers, DOE laboratories, NSF Large Facility
computing organizations and commercial cloud providers. Technologies,
strategies, ideas, and discussions on how OSG can foster partnerships
across the widest possible range of CI.

* How high throughput computing accelerates research, and how OSG can help
users scale up.

* Usability challenges and solutions for distributed high throughput
computing applications.

* Connecting virtual organizations, campus researchers and XSEDE users to
the OSG: command line, science gateways, and workflow frameworks.

* Training and education workshops.

* Serving more of the "long tail" of science with high throughput parallel
computing: incorporating multi-core, GPU and virtual cluster resources
into science workflows using shared and allocated distributed
infrastructure.

* Advanced network analytics services for national Science DMZ
infrastructure.

As has been the custom, the 2017 OSG AHM will be co-located with the U.S.
Large Hadron Collider (LHC at CERN) computing facility meetings.

Logistical information, registration and agenda are available at
https://www.opensciencegrid.org/AHM2017 (This will redirect to the
eiseverywhere.com domain.)

News and Announcements from OSG Operations > GOC Service Update - Tuesday, January 24th at 14:00 UTC

The GOC will upgrade the following services beginning Tuesday, January 24th at 14:00 UTC. The GOC reserves 8 hours in the unlikely event unexpected problems are encountered.

OASIS
* Update cvmfs packages on oasis and oasis-replica to the latest version
* Update frontier-squid on oasis-replica to the latest version
* Enable garbage collection on oasis-replica for the osgstorage.org repositories
* Remove requirement in the oasis-replica install process that external repository servers are responding

Web Services
* Update all software packages associated with OSG Web Pages.

All Services
* Operating system updates; reboots will be required. The usual HA mechanisms will be used, but some services will experience brief outages.

Derek's Blog > Singularity on the OSG

Singularity is a container platform designed for use on computational resources. Several sites have deployed Singularity for their users and the OSG. In this post, I will provide a tutorial on how to use Singularity on the OSG.

About Singularity

Singularity enables users to have full control of their environment. This means that a non-privileged user can “swap out” the operating system on the host for one they control.

Singularity is able to provide alternative environments for users beyond what is installed on the system. For example, suppose you have an application that installs well on Ubuntu, but the system you are running on is RHEL6. You can create a Singularity image of Ubuntu, install the application, and then start the image on the RHEL6 system.

Creating your first Singularity (Docker) image

Instead of making a Singularity image as described here, we will create a Docker image, then use that in Singularity. We are using the Docker image for a few reasons:

  • If you already have a Docker image, then you can use this same image with Singularity.
  • If you are running your job on a Docker-encapsulated resource, such as Nebraska’s Tier 2, then Singularity is unable to use the default images because it cannot acquire a loopback device inside the container.

Creating a Docker image requires root or sudo access. It is usually performed on your own laptop or a machine that you own and on which you have root access.

Docker has a great page on creating Docker images, which I won’t repeat here. A simple Docker image is easy to create using the very detailed instructions linked above.

Once you have uploaded the Docker image to Docker Hub, be sure to keep track of the name and version you will want to run on the OSG.
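
The build-and-upload step looks roughly like this (the image name, tag, and Docker Hub account below are hypothetical):

# Build the image from a Dockerfile in the current directory
docker build -t myuser/myapp:v1 .

# Push it to Docker Hub so Singularity can pull it on the OSG
docker login
docker push myuser/myapp:v1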

Running Singularity on the OSG

For a Singularity job, you have to start the Docker image in Singularity.

The submit file:

universe = vanilla
executable = run_singularity.sh
should_transfer_files = IF_NEEDED
when_to_transfer_output = ON_EXIT
Requirements = HAS_SINGULARITY == TRUE
output = out
error = err
log = log
queue

The important aspect is the HAS_SINGULARITY expression in the requirements: it ensures the job only matches remote nodes that have the singularity command available.

The executable script, run_singularity.sh:

#!/bin/sh -x

# Run the singularity container
singularity exec --bind `pwd`:/srv  --pwd /srv docker://python:latest python -V

The option --bind `pwd`:/srv binds the current working directory into the Singularity container, while --pwd /srv changes the working directory to /srv when the container starts. The last arguments, python -V, are the program that will run inside the Docker image; its output should be the version of Python installed in the image.

You can submit this script the normal way:

$ condor_submit singularity.submit

The resulting output should state what version of Python is available in the docker image.

More complicated example

The example singularity command above is very basic: it only starts the Singularity image and runs Python within it. Below is another example that runs a Python script brought along with the job. In this example, we transfer an input Python script to run inside Singularity, and we also bring back an output file that was generated inside the Singularity image.

#!/bin/sh -x

singularity exec --bind `pwd`:/srv  --pwd /srv docker://python:latest /usr/bin/python test.py

The contents of test.py are:

import sys
stuff = "Hello World: The Python version is %s.%s.%s\n" % sys.version_info[:3]

f = open('stuff.blah', 'w')
f.write("This is a test\n")
f.write(stuff)
f.close()

Also, it is necessary to modify the submit file to add a new line before the queue statement:

transfer_input_files = test.py

This tells HTCondor to transfer the input file test.py along with the job. The complete submit file is sketched below.
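
For reference, the full submit file for this example would then look something like this sketch, based on the submit file shown earlier with the new line added:

universe = vanilla
executable = run_singularity.sh
should_transfer_files = IF_NEEDED
when_to_transfer_output = ON_EXIT
transfer_input_files = test.py
Requirements = HAS_SINGULARITY == TRUE
output = out
error = err
log = log
queue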

When the job completes, you should have a new file in the submission directory called stuff.blah. It will have the contents (in my case):

This is a test
Hello World: The Python version is 2.7.9

Conclusion

Singularity is a very useful tool for software environments that are too complicated to bring along for each job. It provides an isolated environment where the user can control the software, while using the computing resources of the contributing clusters.


News and Announcements from OSG Operations > Announcing OSG Software version 3.3.20

We are pleased to announce OSG Software version 3.3.20.

Changes to OSG 3.3.20 include:
- HTCondor 8.4.10*: Running in SELinux should work now, other bug fixes
- gratia-probe 1.17.2: Improved ability to report local jobs to OSG or not
- Updated to XRootD 4.5.0
- Updated gridftp-hdfs to enable ordered data
- osg-configure 1.5.4: Further updates to support ATLAS AGIS
- Ensure HTCondor-CE gratia probe is installed when installing osg-ce-bosco
- Updated to VOMS 2.0.14
- Completed conversion of packages to use systemd-tmpfiles on EL 7

Changes to the Upcoming Repository include:
- Updated to HTCondor 8.5.8*
- Added Singularity (version 2.2) as a new, preview technology
- Updated to frontier-squid 3.5.23-3.1, a technology preview of version 3

*NOTE: When updating or installing HTCondor on an EL 7 system with SELinux
enabled, make sure that policycoreutils-python is installed before HTCondor.
This dependency will be properly declared in the HTCondor RPM in the next
release.
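
On an EL 7 host, the manual workaround is roughly the following sketch (assuming yum and the usual "condor" package name; adjust if your repository differs):

# Install the SELinux Python utilities first
yum install policycoreutils-python

# Then install or update HTCondor
yum install condor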

Release notes and pointers to more documentation can be found at:

https://www.opensciencegrid.org/bin/view/Documentation/Release3/Release3320

Need help? Let us know:

https://www.opensciencegrid.org/bin/view/Documentation/Release3/HelpProcedure

We welcome feedback on this release!

News and Announcements from OSG Operations > Emergency Service Downtime for OSG Connect, CI Connect, and Stash Data Services

OSG Connect, CI Connect and Stash data services will be in an emergency downtime today due to a data center issue.  Apologies for the inconvenience.

Sincerely,
The OSG User Support Team

