
This is a manual forward-port of the following change merged into icehouse branch:
https://review.openstack.org/215119

When content of an object is modified from filesystem interface, a GET on the object will return inconsistent or incomplete content because the content length originally stored as metadata no longer reflects the actual length of the file after modification.

The complete fix will have two parts:
(1) Return the entire content of object as is to the client
(2) The Etag returned should reflect the actual md5sum of object content

This change only fixes (1) mentioned above. This means, the client will always get the complete content of the file.

Fix (2) is not part of this change. This means, if content length of the object remains same even after modification, the Etag returned would be incorrect. Fixing (2) involves more invasive changes in the code. So that is deferred for now and will be sent as a separate change later.

Reference:
https://bugs.launchpad.net/swiftonfile/+bug/1416720
https://review.openstack.org/151897

Change-Id: I28d0ec33c59eb520be7d15a60adb968692226e3e
Closes-Bug: #1416720
Signed-off-by: Prashanth Pai <ppai@redhat.com>

Swift-on-File
Swift-on-File is a Swift Object Server implementation that enables users to access the same data, both as an object and as a file. Data can be stored and retrieved through Swift's REST interface or as files from NAS interfaces including native GlusterFS, GPFS, NFS and CIFS.
Swift-on-File is deployed as a Swift storage policy, which makes it possible to extend an existing Swift cluster and to migrate data to and from policies with different storage backends.
The main difference from the default Swift Object Server is that Swift-on-File stores objects following the same path hierarchy as the object's URL. In contrast, the default Swift implementation stores the object following the mapping given by the Ring, and its final file path is unknown to the user.
For example, an object with the URL https://swift.example.com/v1/acct/cont/obj would be stored the following way by the two systems:
- Swift:
/mnt/sdb1/2/node/sdb2/objects/981/f79/f566bd022b9285b05e665fd7b843bf79/1401254393.89313.data
- SoF:
/mnt/swiftonfile/acct/cont/obj
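The Swift-on-File mapping shown above can be sketched in a few lines of Python. This is only an illustration of the path layout, not the project's actual code: the function name `sof_path` and the default `mount_root` are assumptions taken from the example path, and a real deployment may use a different mount point.

```python
from urllib.parse import urlsplit, unquote


def sof_path(object_url, mount_root="/mnt/swiftonfile"):
    """Map a Swift object URL to the file path used in the example above.

    Hypothetical helper: assumes URLs of the form
    /<version>/<account>/<container>/<object> and a single mount point.
    """
    # Split into at most four parts so that slashes inside the object
    # name are kept as sub-directories.
    parts = urlsplit(object_url).path.lstrip("/").split("/", 3)
    if len(parts) != 4 or not all(parts):
        raise ValueError("expected /<version>/<account>/<container>/<object>")
    _version, account, container, obj = (unquote(p) for p in parts)
    return "/".join([mount_root.rstrip("/"), account, container, obj])


print(sof_path("https://swift.example.com/v1/acct/cont/obj"))
# /mnt/swiftonfile/acct/cont/obj
```

The API version prefix (v1) is dropped, and the account, container and object names become directory and file names under the mount point.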
Use cases
Swift-on-File can be especially useful in cases where access over multiple protocols is desired. For example, imagine a deployment where video files are uploaded as objects over Swift's REST interface and legacy video transcoding software accesses those videos as files.
Along the same lines, data can be ingested over Swift's REST interface and then analytic software like Hadoop can operate directly on the data without having to move the data to a separate location.
Another use case is where users need to migrate data from existing file storage systems to a Swift cluster.
Similarly, scientific applications may process file data and then select some or all of the data to publish to outside users through the Swift interface.
Limitations and Future plans
Swift-on-File currently works only with filesystems that support extended attributes. It is also recommended that these filesystems provide data durability, since Swift-on-File should not use Swift's replication mechanisms.
GlusterFS and GPFS are good examples of filesystems that work well with Swift-on-File. Both provide a POSIX interface, a global namespace, scalability, data replication and support for extended attributes.
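Whether a given mount supports extended attributes can be probed before deploying. The following is a minimal sketch, assuming a Linux host (Python exposes `os.setxattr` only there); the function name and the attribute name `user.sof_probe` are made up for this check and are not names used by Swift-on-File.

```python
import os
import tempfile


def supports_user_xattrs(directory):
    """Return True if a user.* extended attribute can be set under `directory`."""
    if not hasattr(os, "setxattr"):
        # os.setxattr / os.getxattr are only available on Linux.
        return False
    # The temporary probe file is removed when the with-block exits.
    with tempfile.NamedTemporaryFile(dir=directory) as probe:
        try:
            os.setxattr(probe.name, "user.sof_probe", b"1")
            return os.getxattr(probe.name, "user.sof_probe") == b"1"
        except OSError:
            # For this probe, any OS-level failure is treated as "not usable".
            return False


print(supports_user_xattrs(tempfile.gettempdir()))
```

In practice the check would be run against the filesystem's mount point (for example the directory where the GlusterFS or GPFS volume is mounted) rather than the temporary directory used here.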
Currently, files added over a file interface (e.g., native GlusterFS) do not show up in container listings. Those files are still accessible over Swift's REST interface with a GET request. We are working to provide a solution to this limitation.
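Because such a file is missing from the container listing, a client has to derive its object path from the file path. A rough sketch of that derivation follows; the function name, the default mount point and the `v1` version prefix are assumptions for illustration only.

```python
import posixpath
from urllib.parse import quote


def object_url_path(file_path, mount_root="/mnt/swiftonfile", version="v1"):
    """Derive the REST path to GET for a file written through a file interface.

    Hypothetical helper: assumes the layout <mount_root>/<account>/<container>/<object>.
    """
    rel = posixpath.relpath(file_path, mount_root)
    parts = rel.split("/")
    if parts[0] == posixpath.pardir or len(parts) < 3:
        raise ValueError("file is not inside <mount_root>/<account>/<container>/")
    account, container = parts[0], parts[1]
    obj = "/".join(parts[2:])
    # quote() leaves "/" unescaped by default, so nested object names survive.
    return "/" + "/".join([version, quote(account), quote(container), quote(obj)])


print(object_url_path("/mnt/swiftonfile/acct/cont/videos/clip.mp4"))
# /v1/acct/cont/videos/clip.mp4
```

The returned path would be appended to the cluster's endpoint and requested with a normal authenticated GET.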
Because Swift-on-File relies on the data replication support of the filesystem, the Swift object replicator process has no role for containers using the Swift-on-File storage policy. This means that Swift geo-replication is not available to objects in containers using the Swift-on-File storage policy. Multi-site replication for these objects must be provided by the filesystem.
Future plans include adding support for filesystems without extended attributes, which should extend the ability to migrate data from legacy storage systems.
Get involved:
To learn more about Swift-on-File, you can watch the presentation given at the Paris OpenStack Summit: Deploying Swift on a File System. The Paris presentation slides can be found here. Also see the presentation given at the Atlanta OpenStack Summit: Breaking the Mold with OpenStack Swift and GlusterFS. The Atlanta presentation slides can be found here.
Join us in contributing to the project. Feel free to file bugs, help with documentation or work directly on the code. You can file bugs or blueprints on Launchpad or find us in the #swiftonfile channel on Freenode.